2016 AFL Predictive Model: Data Scientists’ Statistical Approach

Posted: March 23, 2016

Data Scientists


2016 AFL Predictions

Utilising a vast array of variables and proven statistical methods, the Data Scientists’ AFL Prediction Model takes a purely mathematical approach to predicting the outcomes of weekly matches.

The chosen model is a Random Forest, with a calculated Elo rating as one of several predictors. The 2016 AFL predicted probabilities can be used to identify value Back & Lay bets across the Betfair Exchange market.


Variables Utilised to Predict AFL

  • Elo Rating using the ratio of team scores as the outcome
  • Home vs Away performance
  • Venue performance
  • Week when game was played (captures fatigue through season)
  • Min, max and median player fantasy ratings per team
  • Recent performance against opposition (previous 3 games)
  • Recent performance at venue (previous 10 games)


Best Performing AFL Variable 

The main driver of the model (based on variable importance) was the Elo rating. Originally designed to calculate the relative skill levels of chess players, an Elo rating is a single number per team that rises or falls from week to week according to game results. Adding an Elo rating significantly improved the accuracy of the Model’s AFL predictions.
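As a rough illustration of how such a rating moves after a single game, here is a minimal Python sketch. It is not the Model's own code (the Model was built in R). It assumes the standard Elo logistic expectation on a 400-point scale and takes the "ratio of team scores" outcome to mean the home side's share of total points; the function names and the `k_factor` and `home_advantage` values are hypothetical placeholders, not the fitted values.

```python
def expected_share(rating_home, rating_away, home_advantage):
    """Standard Elo expectation, with the home side given a rating bonus."""
    diff = rating_home + home_advantage - rating_away
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def update_elo(rating_home, rating_away, score_home, score_away,
               k_factor=20.0, home_advantage=30.0):
    """Return the updated (home, away) ratings after one match.

    Assumption: the outcome is the home side's share of total points.
    k_factor and home_advantage are placeholders, not fitted values.
    """
    actual = score_home / (score_home + score_away)
    expected = expected_share(rating_home, rating_away, home_advantage)
    # Points gained by one side are taken from the other.
    shift = k_factor * (actual - expected)
    return rating_home + shift, rating_away - shift


# Two evenly rated teams; the home side wins 100 to 80.
new_home, new_away = update_elo(1500.0, 1500.0, score_home=100, score_away=80)
print(round(new_home, 2), round(new_away, 2))
```

Because the same shift is added to one side and subtracted from the other, the total number of rating points across the two teams is unchanged by a match. A larger K-factor makes ratings react more sharply to each result.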


The Process: AFL Predictions

  • The Elo rating was calculated for each team from 2000 to 2015 using the ratio of scores as the outcome variable. To calculate the Elo rating, two parameters were fitted using the optimx package in R: one for home advantage and the other for the K-factor. The K-factor influences the number of Elo points taken from and given to the opposing teams.
  • The Elo rating was then used as one of several predictors in the random forest classification model (win or lose). The random forest model was trained on 70% of the data using the caret package in R, with repeated 5-fold cross-validation and ROC as the metric used to select the optimal model. The remaining 30% of the data was used to test the model.
  • After iterating through the training and testing of the model, the probabilities of winning each game for the 2016 season were calculated using the trained random forest model. Probabilities were adjusted to sum to one for each game, and the odds were calculated as the inverse of the probabilities.
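The final step above, scaling each game's two probabilities so they sum to one and then inverting them to get odds, can be sketched in a few lines of Python. This is an illustration only, not the Model's R code; the function names and the raw probabilities are made up for the example.

```python
def normalise_pair(p_home, p_away):
    """Scale two raw win probabilities so that they sum to one."""
    total = p_home + p_away
    return p_home / total, p_away / total


def decimal_odds(probability):
    """Decimal odds are the inverse of the win probability."""
    return 1.0 / probability


# Hypothetical raw model outputs for one game (they sum to 1.10).
p_home, p_away = normalise_pair(0.66, 0.44)
odds_home, odds_away = decimal_odds(p_home), decimal_odds(p_away)
print(round(p_home, 2), round(p_away, 2))        # adjusted probabilities
print(round(odds_home, 2), round(odds_away, 2))  # corresponding odds
```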


Utilising AFL Predictions

These predicted AFL probabilities can be used to identify value Back & Lay bets across the Exchange market. Comparing the predicted probabilities to the market prices highlights opportunities where value may be present.

If, for example, the Model prices Collingwood as a $1.41 (70.92% probability) chance and the market prices them as a $1.61 (62.11% probability) chance, then you’ve identified a value Back bet on Collingwood. Determining the size of your wager (staking) on Collingwood is vital to make the most of that value.
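The Collingwood comparison can be checked with a short Python sketch. It assumes a simple rule: Back when the market price is longer than the Model price, Lay when it is shorter. It ignores exchange commission, and the function names are hypothetical.

```python
def implied_probability(decimal_odds):
    """Probability implied by a decimal price, e.g. $1.41."""
    return 1.0 / decimal_odds


def value_side(model_odds, market_odds):
    """Return 'back', 'lay' or None by comparing the two prices.

    Assumption: commission and staking are ignored in this sketch.
    """
    if market_odds > model_odds:
        return "back"  # market pays more than the Model thinks is fair
    if market_odds < model_odds:
        return "lay"   # market pays less than the Model thinks is fair
    return None


model_prob = implied_probability(1.41)
market_prob = implied_probability(1.61)
print(round(model_prob * 100, 2), round(market_prob * 100, 2))
print(value_side(model_odds=1.41, market_odds=1.61))
```

The gap between the two implied probabilities (roughly 8.8 percentage points here) is one simple way to rank opportunities before deciding on stakes.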

Factors such as player injuries and weather aren’t considered in the assessments but can be manually considered in your wagering to give you an added edge.

Bet Smart with the Data Scientists’ predictions all season long.