5 Reasons Why Predicting The Brownlow Is Hard

Since 2016, Fat Stats has been building statistical models with the aim of predicting the Brownlow Medal. Every year, they have tweaked the models and datasets with the aim of emulating the decisions of the umpires on game day and (hopefully) improving their models.

This is a hard task – every year they review their results and something different they hadn’t thought about pops up. Fat Stats have decided to run through 5 things which make this particular prediction difficult. All of the plots shown are generated in the chaRlie app.


Interested in modelling the Brownlow? Enter Betfair’s $5,000 Datathon.


Umpires are Human too

Most umpires must love footy, otherwise they wouldn’t put up with the crap that’s thrown at them every weekend. You know what they say – love is blind. Like all of us, umpires are going to have things they like in the way a player plays and things they don’t care much about.

My favourite thing in footy is a full forward being hit on the chest by a 40m dart on a full speed lead. Other people like big tackles, a defender blanketing someone or in the case of GWS supporters – unnecessary contact to the face and eyes of star opposition players.

Try as they might, umpires cannot fully suppress these biases on game day. Brownlow votes are allocated based on the collective decision of the umpires on the day, without seeing statistics. Every game has a different configuration of umpires, in different moods and with different likes and dislikes – this creates a lot of variance!

One famous example of an umpire howler occurred last year in round two, when Marley Williams managed to poll 3 votes in a game where he gathered 14 disposals, three marks, one clearance, three clangers and one free kick. Not surprisingly, the model gave him a 0% chance of polling a vote at all.

Similarly, Jake Carlisle managed to scrape 1 vote. In fact, the model was very clear and confident about where it thought the votes should go. Shaun Higgins and Ben Brown probably feel hard done by for this game.

chaRlie game output for round 2 2018 – Saints versus Kangaroos. Red is 3 votes, orange 2, and yellow 1 – predicted vote range is shown in green.


Times are changing

Football evolves over time, and what the umpires deem important when handing out votes is no exception. The way Richmond won its flag in 2017 was very different to the way the Bulldogs won theirs the year before, and more different again to West Coast and Sydney in 2006. Like opposition coaches and players, the umpires react to these trends in their Brownlow vote allocation.

The above plots show two things. Firstly, players get 35 possessions or more much more frequently post-2005 than they did in the 90s (the Greg Williams influenced 1990 – 1992 aside). Last year, 102 players achieved the 35 disposal mark in the regular season, with Tom Mitchell accounting for 10 of those.

The bottom plot shows that the umpires rewarded this feat for a while, but since a peak of 60% in 2013 (Gary Ablett driven) the percentage of 35 disposal games that earn three votes has dropped to almost 1 in 3 (the bottom plot's y axis is a percentage).
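For anyone wanting to reproduce the bottom plot from their own data, the calculation is roughly the following. This is a minimal Python sketch, not the chaRlie code: the record layout (season, disposals, votes), the function name and the sample rows are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical player-game records: (season, disposals, brownlow_votes).
# These rows are invented purely to show the calculation.
games = [
    (2013, 38, 3),
    (2013, 36, 3),
    (2013, 35, 2),
    (2013, 29, 3),  # under 35 disposals, so it is ignored
    (2018, 40, 3),
    (2018, 37, 0),
    (2018, 35, 1),
]

def three_vote_rate(records, threshold=35):
    """Per season: % of games with >= threshold disposals that earned 3 votes."""
    big_games = defaultdict(int)
    three_votes = defaultdict(int)
    for season, disposals, votes in records:
        if disposals >= threshold:
            big_games[season] += 1
            if votes == 3:
                three_votes[season] += 1
    return {s: 100 * three_votes[s] / big_games[s] for s in sorted(big_games)}

rates = three_vote_rate(games)
for season, pct in rates.items():
    print(season, round(pct, 1))
```

With real player-game data in place of the toy rows, the same function gives one percentage per season, which is what the y axis of the bottom plot represents.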

There are a few potential reasons for this: a change in game plans, a focus on efficiency, Gazza getting old, multiple players in a game getting 35 touches, etc.

However, when you use the previous 3-5 years of data to build a machine learning model it may not take into account these changes, leading to potential over-predicting on high disposal games. This has definitely happened for chaRlie in recent years.
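A toy calculation shows why a trailing training window lags a trend like this. The sketch below is Python with invented season rates: only the 60% peak in 2013 and the end point of roughly 1 in 3 echo the figures quoted earlier, and pinning that end point to 2018 is an assumption. It is not how chaRlie is built, and a real model learns far more than one average.

```python
# Hypothetical three-vote rates (%) for 35+ disposal games, by season.
# Only the 2013 peak (60) and the final ~33 echo the article; the years
# in between are invented to give a steady decline.
rate_by_season = {2013: 60, 2014: 55, 2015: 50, 2016: 45, 2017: 40, 2018: 33}

def trailing_average(rates, target_season, window):
    """Average rate over the `window` seasons before target_season."""
    seasons = [target_season - k for k in range(1, window + 1)]
    return sum(rates[s] for s in seasons) / window

# What a model trained on the previous five seasons would "expect" in 2018.
learned = trailing_average(rate_by_season, 2018, window=5)
actual = rate_by_season[2018]

print(f"rate implied by 2013-2017 data: {learned:.0f}%")
print(f"rate seen in 2018:              {actual}%")
print(f"over-prediction:                {learned - actual:.0f} points")
```

Shortening the window or weighting recent seasons more heavily narrows the gap, at the cost of training on less data.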

Certain players can even influence the models. If ruckmen who poll votes are present in the dataset, then the model is much more likely to predict votes for ruckmen. If not, the opposite can occur. Balancing the dataset is key to a good prediction.


In Data We Trust

Not every component of an AFL game can be captured by data, far from it at the moment. When Jeremy Howe stands on Aaron Sandilands’ head for a mark, the degree of difficulty is not taken into account in statistics – he is allocated a mark and a contested mark, and potentially an intercept mark.

He would get the same result if he had bullied Caleb Daniel one on one on the wing. The timing of that mark is also not taken into account. Not that they voted on the game, but you know that if the umpires had walked into the rooms after the 2005 grand final with Leo Barry’s mark fresh in their minds, they would have been a lot more likely to give him votes. The statistics, as they stand, will not capture the fact that the mark decided the game.

Research by Michael Bailey, later written about in the great Footballistics book, showed that players with bald heads or tattoos are up to 2 times more likely to receive votes than their more nondescript teammates.

Not many people can be bothered creating that dataset, and as such the models suffer from this measurable bias in the umpires’ attention. Data is steadily improving, with the AFL/Champion Data slowly releasing more and more for public consumption.

This will lead to improved models and insights into the game, but it will never capture every nuance of the complex and evolving game of AFL.


Role of the Media

“Whoever controls the media, controls the mind” – Jim Morrison

The above plot shows the Google Trends results for the previous month (leading up to the Preliminary final weekend 2019). It shows that Toby Greene has had up to 75-100 times the Google hits relative to Brodie Grundy, who also appears to be frequently discussed.

It is impossible for this saturation to not influence umpires, whether it be positively or negatively. We won’t know for sure, being finals, but it’s very unlikely that umpires are going to look at Toby positively next time he plays.

One famous example of this pre-existing bias due to a media situation is James Hird in 2004. After publicly criticising umpire Scott McLaren (rightfully) on The Footy Show after the round 2 debacle, Hird was fined $20,000 and agreed to do umpire promotion.

The next round, the umpires showed they hold a grudge by awarding Hird zero votes after one of the greatest last quarters of all time against West Coast. Hird got 34 disposals (14 in the last quarter, 20 contested) and kicked three goals, including the iconic match-winning snap from the pocket where he hugged a fan. Umpires hold a grudge.


Zero Sum Game

One of the things that is challenging about the Brownlow is someone has to get three votes, two votes and one vote. If you predict player A to get 3 votes, and they get 1, and player B to get 1 vote and they get 3, you aren’t two votes out – you are 4. In cases where you have two players on the same team who are thereabouts at the pointy end of the night, these kinds of games can have a huge effect on the end prediction results.
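The arithmetic in that swap is easy to check. Here is a minimal Python sketch, assuming you score a prediction by summing the absolute vote difference for every player (the function name and the "Player A"/"Player B" labels are just for illustration):

```python
def total_vote_error(predicted, actual):
    """Sum of absolute vote differences across every player in either dict."""
    players = set(predicted) | set(actual)
    return sum(abs(predicted.get(p, 0) - actual.get(p, 0)) for p in players)

# The swap described above: A predicted 3 and polls 1, B predicted 1 and polls 3.
predicted = {"Player A": 3, "Player B": 1}
actual = {"Player A": 1, "Player B": 3}

error = total_vote_error(predicted, actual)
print(error)  # 4
```

Because the votes in a game are fixed at 3, 2 and 1, over-predicting one player almost always means under-predicting another, so a single misjudged game is counted twice.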

chaRlie round by round output showing three Western Bulldogs players in 2018. Predicted vote range shown in green and actual votes in red.

In 2018, the chaRlie model over-predicted Jack Macrae by 11 votes, its second biggest stuff-up. A lot of the issue with the Dogs is that when the midfield is on, they are all on, meaning that it's difficult to figure out how the umpires will rank them.

The example above shows that Jack Macrae was predicted to get 5 or 6 votes between rounds 19 and 21, and Marcus Bontempelli was expected to get around 3. The Bont polled 6 and Macrae 1, effectively a 7-8 vote turnaround! Understanding the uncertainty the model has with particular games and players is key to its successful use.
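The turnaround figure can be reproduced from those numbers. This is a rough Python sketch that treats each prediction as a whole-number range and reports the smallest and largest possible miss. Reading "around 3" as exactly 3 is an assumption, and the helper name is hypothetical.

```python
def miss_range(predicted_low, predicted_high, actual):
    """Smallest and largest absolute miss for a predicted vote range."""
    misses = [abs(v - actual) for v in range(predicted_low, predicted_high + 1)]
    return min(misses), max(misses)

macrae = miss_range(5, 6, actual=1)  # predicted 5 or 6, polled 1
bont = miss_range(3, 3, actual=6)    # predicted around 3, polled 6

# Add the two misses together to get the combined swing between the players.
turnaround = (macrae[0] + bont[0], macrae[1] + bont[1])
print(turnaround)  # (7, 8)
```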


Conclusion

Hopefully this blog has highlighted some of the things to think about when you go about your predictive modelling, however you do it.

Good luck, and may the Brownlow gods be with you like they were with Marley.

