On May 14, 2018, the Supreme Court struck down a federal law that had barred most states from authorizing sports gambling, clearing the way for each state to legalize it. Since then, 16 states have made the activity legal, with another 20 states projected to do the same in the next three years. Sports betting is already very popular in America. Wagers on NFL and college football alone come to nearly $95 billion each year. Overall, nearly $150 billion is wagered illegally on sports each year in the United States.
This has led many companies in the sports industry to try to capitalize on the previously illegal activity. But they’re not the only ones: more people in the U.S. are getting involved in sports betting to make some money on the side, and some have made it a full-time job. The best way to make money from sports betting would be to know exactly what will happen in a game or a match. That’s impossible, so the best you can do is predict the outcome as accurately as possible. The question, then, is whether one way of predicting what happens in a game works better than the others.
That is what I have attempted and will show in this study. Using neural networks, a very powerful predictive modelling method, I will attempt to predict the performance of two NFL players — one of whom is a very consistent player (Tom Brady) and one fairly inconsistent (Todd Gurley) — to see how well the neural networks perform against different types of players.
What Are Neural Networks?
Neural networks are a set of algorithms, modelled loosely after the human brain, that are designed to recognize patterns, including ones too complex for a human to spot. They work through a set of interconnected layers. The first layer is the input layer, where the data/variables are presented to the network. The data is then passed to the middle layers, which are called the ‘hidden layers’. This is where the actual processing is done via a system of weighted connections. The hidden layers then connect to an output layer, where a given output is produced.
In summary, a neural network maps inputs to outputs by learning the relationships between them. That is the basis of a neural network, but in this study I decided to use three different kinds of neural networks to find the best predictor of a player’s performance. These three are the Artificial Neural Network (ANN), the Recurrent Neural Network (RNN), and the Multivariable Recurrent Neural Network.
1. Artificial Neural Network (ANN)
An Artificial Neural Network is the most common type of neural network. It is mainly used for regression and classification. It takes data from certain variables, finds correlations within the data, and then gives an output according to that data.
One example of an Artificial Neural Network in use is predicting the salary of someone in a company. The inputs into the ANN would be variables such as age, job title, and years with the company. These are then passed to the hidden layer, which finds the correlations, and the network outputs the predicted salary.
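To make the input, hidden, and output layers concrete, here is a minimal sketch of a single forward pass for the salary example. The weights, the layer sizes, and the way the three inputs are scaled are all hypothetical placeholders chosen by hand; a real network would learn its weights from data, and this is not the code used in the study.

```python
def dense(inputs, weights, biases, activation):
    """One fully connected layer: a weighted sum per neuron, then an activation."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def relu(z):
    return max(0.0, z)

def identity(z):
    return z

# Hypothetical, hand-picked weights (a trained network would learn these).
HIDDEN_W = [[0.5, 0.2, 0.1], [-0.3, 0.8, 0.4]]  # 2 hidden neurons, 3 inputs each
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[1.0, 2.0]]                          # 1 output neuron, 2 hidden inputs
OUTPUT_B = [0.5]

def predict(features):
    """Input layer -> hidden layer -> output layer."""
    hidden = dense(features, HIDDEN_W, HIDDEN_B, relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B, identity)[0]

# Assumed encoding, scaled to roughly 0..1: age, job-title level, years with company.
example = [0.35, 0.5, 0.2]
print(predict(example))  # an untrained, illustrative number, not a real salary
```

Training would adjust the weights so that this output moves closer to the known salaries in the data.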
2. Recurrent Neural Network (RNN)
A Recurrent Neural Network is quite different from an Artificial Neural Network. In an Artificial Neural Network, each input is processed independently of the ones that came before it. In a Recurrent Neural Network, the output from the previous step is fed as input into the current step. This is because Recurrent Neural Networks are used for time series analysis: their purpose is to predict a value over a period of time given past data.
One example of an RNN in use is predicting the stock price of a company. If we want to predict the stock price over the next 30 days, we need the stock price for the days, months, or years before. Using that data, the RNN will predict the value and direction of the stock price over those next 30 days.
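The idea of feeding the previous step into the current one can be sketched with a single recurrent unit. This is a simplified illustration, not the network from the study: the function name, the weights, and the scaled prices are assumptions, and a practical model would use many units and learned weights.

```python
import math

def rnn_forecast(series, w_in, w_rec, w_out, bias=0.0):
    """Run a one-unit recurrent network over a series and return a next-step forecast.

    The hidden state h summarizes everything seen so far. At each step the new
    observation is mixed with the hidden state from the previous step.
    """
    h = 0.0
    for x in series:
        h = math.tanh(w_in * x + w_rec * h + bias)
    return w_out * h

# Made-up, pre-scaled prices and hypothetical weights, for illustration only.
rising = [0.1, 0.5, 0.9]
falling = [0.9, 0.5, 0.1]
print(rnn_forecast(rising, w_in=0.5, w_rec=0.5, w_out=1.0))
print(rnn_forecast(falling, w_in=0.5, w_rec=0.5, w_out=1.0))
```

The two series contain the same values in a different order, yet the forecasts differ. That sensitivity to order is what separates a recurrent network from a plain ANN.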
3. Multivariable Recurrent Neural Network
Multivariable Recurrent Neural Networks can be seen as a combination of an Artificial Neural Network and a Recurrent Neural Network. A Multivariable Recurrent Neural Network is a Recurrent Neural Network but with multiple inputs, not just one.
An example of this is similar to the one above. The difference is that the stock price is no longer the only input being remembered. In this situation the inputs may also include the volume, the P/E ratio, or the beta.
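In practice, a multivariable recurrent network is usually fed windows of past steps, where every step carries several features. The sketch below shows one way such windows might be built. The helper name `make_windows`, the column order, and the numbers are illustrative assumptions, not data or code from the study.

```python
def make_windows(rows, target_index, window):
    """Slice a multivariate time series into (past window, next target) pairs.

    rows: list of per-step feature lists, oldest first.
    target_index: the column the network should learn to predict.
    window: how many past steps the network sees for each prediction.
    """
    samples = []
    for end in range(window, len(rows)):
        past = rows[end - window:end]        # the steps the network remembers
        target = rows[end][target_index]     # the value it should predict next
        samples.append((past, target))
    return samples

# Made-up rows: [price, volume (millions), P/E, beta]
rows = [
    [100.0, 1.2, 15.0, 1.1],
    [101.5, 1.0, 15.2, 1.1],
    [99.8, 1.6, 14.9, 1.2],
    [102.3, 1.3, 15.3, 1.1],
]
samples = make_windows(rows, target_index=0, window=2)
for past, target in samples:
    print(past, "->", target)
```

With a single column per row, the same helper produces the input for an ordinary RNN, which is why the multivariable version can be seen as the same idea with more inputs.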
To predict a player’s performance for an upcoming game, we need data that is available before that game and that affects how the player performs. Since our two case studies are offensive players, the available data that affects their performance is the opposing team’s defensive data, whether they play at home or away, and a few other features. These features are going to be the inputs into our neural networks. The output is then going to be the player’s predicted fantasy points.
Fantasy points have become very popular in sports betting because they indicate how well a player performs. The higher the number of points, the better the player performed (and vice versa). Looking at the table, if a quarterback (QB) throws for three touchdowns, they get 12 points. If they throw for 250 yards, they get 10 points, and if they throw one interception, they lose 1 point. Therefore, just from a player’s fantasy points, you can gauge how well they performed during the game.
The test, then, is to predict each player’s fantasy points for the 2018–2019 season. Since this season has already passed, we can analyze how close each prediction was to the actual result. On top of using the three different prediction methods, we are also going to analyze the projected fantasy points from two of the top fantasy sports websites: CBS Sports Fantasy and Fantasy Pros. This lets us compare how well our prediction methods performed against websites and companies that are dedicated to fantasy sports.
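The three quarterback examples imply a simple scoring rule: 4 points per passing touchdown, 1 point per 25 passing yards, and minus 1 point per interception. The sketch below applies only those rates, which are inferred from the examples; the full scoring table may include other categories, and the function name is my own.

```python
def passing_fantasy_points(touchdowns, yards, interceptions):
    """Fantasy points from passing stats, using rates inferred from the examples.

    Assumed rates: 4 per passing touchdown, 1 per 25 passing yards,
    -1 per interception. Other scoring categories are not modelled here.
    """
    return 4 * touchdowns + yards / 25 - interceptions

# A game with all three examples combined: 12 + 10 - 1
print(passing_fantasy_points(touchdowns=3, yards=250, interceptions=1))  # 21.0
```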
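One simple way to measure how close a set of predictions was to the actual results is the mean absolute error over the season's games. This is only a sketch under that assumption: it may not be the accuracy measure behind the figures reported in this study, and the predictor names and numbers below are made up for illustration.

```python
def mean_absolute_error(predicted, actual):
    """Average size of the miss, in fantasy points, across all games."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rank_predictors(predictions, actual):
    """Return (name, error) pairs sorted from most to least accurate."""
    errors = {
        name: mean_absolute_error(values, actual)
        for name, values in predictions.items()
    }
    return sorted(errors.items(), key=lambda item: item[1])

# Made-up numbers purely for illustration, not results from this study.
actual = [20.0, 15.0, 25.0]
predictions = {
    "model_a": [18.0, 16.0, 24.0],
    "model_b": [25.0, 10.0, 20.0],
}
ranking = rank_predictors(predictions, actual)
for name, error in ranking:
    print(name, round(error, 2))
```

A lower error means the predictor was closer to the real fantasy points on average, so the first entry in the ranking is the best predictor under this measure.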
Case 1: Tom Brady (Consistent)
Tom Brady is currently the quarterback of the New England Patriots. Over the last decade he has been recognized as one of the top quarterbacks in the league. He is the only player in history to win six Super Bowls and has won three league MVP awards. He is a future Hall of Fame player and one of the best to ever do it. Tom Brady was picked for this study because he has years of data available and is a very consistent player. One example is that he has led his team to the playoffs every year for the past decade.
In summary, this is how the different neural networks are going to work for Tom Brady. The ANN will use the opposing team’s defensive stats over the past seven years (with some other features) and the fantasy points from those games to predict the fantasy points for the 2018–2019 season. The RNN will focus only on the fantasy points of the past few games to find a pattern and then predict the next game’s fantasy points for Tom Brady. Finally, the Multivariable RNN will combine the two, so the multiple features explained above are remembered over the past few games in order to predict the next game’s fantasy points.
The bar graph above shows the predicted fantasy points for the three prediction methods, Fantasy Pros, CBS Sports Fantasy, and Tom Brady’s real fantasy points over each of the 16 games in the 2018–2019 season. Since it may be hard to see from the graph how accurate each method is, the data is summarized in the table below.
Analyzing the Results
Looking at the results, Fantasy Pros was the best predictor, followed by the Multivariable RNN, ANN, RNN, and then CBS Sports.
When viewing the data and the results, there is a reason they are in that order. Starting with our neural networks, the RNN was the worst of the three. An RNN comes in handy when a player has an injury or doesn’t play well in a particular month. It will see the pattern that the player isn’t performing well after an injury (or in a given month) and adjust accordingly. But since Tom Brady is a very consistent player who gets few to no injuries, the only thing that really affects his game is the team he’s going up against. That’s where the ANN comes in handy.
The ANN was the second best of the neural networks because it takes just the opposing team’s defensive stats and outputs the predicted fantasy points based on those stats. The problem with the ANN in this situation is that if a player was injured or plays badly during the winter, it wouldn’t take that into consideration the way the RNN would. That is why the Multivariable RNN has the best accuracy of the three: it is a combination of the ANN and the RNN.
Now, looking at the other two, the projections of the two fantasy sites are at opposite ends of the spectrum. CBS Sports had the worst accuracy. Its projections come from experts in the company, and unfortunately these experts weren’t too accurate. Fantasy Pros had the best accuracy, and it bases its projections on advice from more than 100 experts. This explains why Fantasy Pros beat our neural networks by a small margin in accuracy: since Tom Brady is a very consistent player, his performance is easier to predict than that of someone who is inconsistent, so these experts were able to edge out our neural networks in this case.
Case 2: Todd Gurley (Inconsistent)
Todd Gurley is currently the running back of the Los Angeles Rams. He has been in the league for four years now and has been a solid running back for the Rams. When he entered the league, he had a breakout rookie season, but in his second year he took a step back in production. Over the last two seasons, though, he has appeared to improve, with flashes of greatness in some games. However, there have also been games where he has produced nearly nothing on the field. Todd Gurley is a very interesting case because he performs either very well or very poorly. This makes him the perfect candidate for running the same test that was used on a consistent player (Tom Brady), but now on an inconsistent player.
In summary, this is how the different neural networks are going to work for Todd Gurley. The ANN will use the opposing team’s defensive stats over the past three years (with some other features) and the fantasy points from those games to predict the fantasy points for the 2018–2019 season. The RNN will focus only on the fantasy points of the past few games to find a pattern and then predict the next game’s fantasy points for Todd Gurley. Finally, the Multivariable RNN will combine the two, so the multiple features explained above are remembered over the past few games in order to predict the next game’s fantasy points.
The bar graph above shows the predicted fantasy points for the three prediction methods, Fantasy Pros, CBS Sports Fantasy, and Todd Gurley’s real fantasy points over each of his 14 games in the 2018–2019 season. Since it may be hard to see from the graph how accurate each method is, the data is summarized in the table below.
Analyzing the Results
Looking at the results, the Multivariable RNN was the best predictor, followed by Fantasy Pros, RNN, ANN, and then CBS Sports.
The order of accuracy for Todd Gurley is quite different from Tom Brady’s, and there is a reason for this. Projections from CBS still remained at the bottom, but this time the RNN was better than the ANN. The main reason is that, because Todd Gurley is such an inconsistent player, he has games where he performs great against really good defensive teams and games where he performs terribly against really bad defensive teams. This ends up confusing the ANN and makes it less accurate. As for the RNN, as I said above, it comes in handy when a player has an injury, because it can detect that the player doesn’t perform as well after the injury and predict lower than it normally would. And unlike Tom Brady, Todd Gurley has had a number of injuries. In fact, over his four-year career, he has played all 16 games in a season only once. Therefore, the RNN ends up being more accurate than the ANN.
The top two predictors are the same as for Tom Brady, but in this case the Multivariable RNN took the top spot. The reason is simply that Todd Gurley’s fantasy points per game were much harder to predict than Tom Brady’s. With Todd Gurley being the more inconsistent player, the experts that Fantasy Pros relies on couldn’t predict as accurately as they did for Tom Brady. The Multivariable RNN was able to see patterns that humans couldn’t and, even with the inconsistency of the stats, was able to outperform the Fantasy Pros experts.
In the end, neural networks showed how powerful they truly are. In both cases, each of the three neural networks predicted with above 70% accuracy, with the best neural network at around 75%. In addition, they were able not only to compete with leading experts in the field but, in Todd Gurley’s case, to outperform them.
In terms of sports betting, neural networks can become a real asset. As we saw, they can predict uncertain performances better than leading experts. Betting on uncertain events is riskier, which makes the potential payout much bigger. Therefore, if you want to play it safe and get a small prize, keep listening to the experts on sports networks. But if you want to make some real money, neural networks can be a way to increase your chances of winning that big prize. Furthermore, the true potential of neural networks is only beginning to be discovered.
Neural network research is moving quickly and becoming more popular in the data science world, with new discoveries happening yearly. Thus, in a few years neural networks might be even better at predicting a player’s performance. But even today there is room to improve the predictions. With the creation of advanced stats in sports and some other features, there is a possibility of increasing the accuracy of the neural networks with data that wasn’t present in this study.
If you would like to see the code for each of the neural networks and how they predicted the fantasy points, it can be found at the link below on my GitHub.