In 2001 I spent many months creating an Elo rating system for professional snooker. After numerous false starts and failed attempts, I eventually settled on a method that was producing statistically reliable figures which helped me identify value bets for matches.
The problem was that, although the painstakingly produced statistics could give me a likelihood of a player winning a match, they couldn’t tell me the likely score, which limited the number of bets I could place.
The same problem is faced by soccer punters. They can use their ratings to work out that Manchester United has a 62% chance of winning a match against Arsenal, but they can’t work out the odds of a 1-1 draw.
The solution for most professional soccer punters is to use a Poisson probability distribution model, named after Siméon Denis Poisson, a French mathematician working in the early 19th century.
The best way to understand the Poisson distribution is to work through an example. If you are reasonably familiar with a program like Excel, you should find this relatively easy to do.
Step 1: Cut and Paste the Data: Home and Away League Tables
Take a football league of your choice. The example below shows the final standings of the 2015-16 English Premier League. Search for “home/away split” and you’ll find lots of websites that have done the hard work of compiling results for you:
Step 2: Calculate League and Team Goal Averages
Much of the data in these league tables is unnecessary for calculating a Poisson distribution, so the next step is to strip out what’s not needed and insert averages of goals for and goals against for every team, as well as calculating the averages for the league. Again, a bit of Excel knowledge makes this a quick process:
Step 3: Calculate Attacking and Defensive Metrics
Using the table above, you then divide the “Average Goals For” figure for each team by the average for the league. You do the same for “Average Goals Against”. Do this for both the Home and Away league tables. What you then get are Attacking and Defending Metrics for each team, home and away, where a value of 1.00 means exactly league average. The higher a team’s Attacking Metric, the more goals they score; the higher a team’s Defending Metric, the more goals they let in:
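As a rough illustration of Step 3, the division can be sketched in Python. The function name and the 2.47 team average are assumptions: the average is back-calculated from the Manchester City figures quoted in Step 4 (1.66 x 1.49), not read from the league table.

```python
def strength_metric(team_average: float, league_average: float) -> float:
    """Divide a team's average goals by the matching league average."""
    if league_average <= 0:
        raise ValueError("league average must be positive")
    return team_average / league_average


# Assumed input: Man City's average home goals, back-calculated as 1.66 x 1.49.
man_city_home_goals_for = 2.47
# League average for goals scored at home, as quoted in Step 4.
league_home_goals_for = 1.49

home_attack = strength_metric(man_city_home_goals_for, league_home_goals_for)
print(f"Man City home attacking metric: {home_attack:.2f}")
```

The same function covers Defending Metrics: pass the team's average goals against and the league's average goals against for the same venue.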
Step 4: Calculate the Average Expected Goals in a Match
Let’s assume that Manchester City are playing West Ham at home. We need to calculate the average number of goals we would expect each team to score. To do this we use the following formula:
Manchester City: Man City Home Attacking Metric (1.66) x West Ham Away Defending Metric (0.67) x League Average for Goals For at Home (1.49) = 1.66
West Ham: West Ham Away Attacking Metric (0.87) x Man City Home Defending Metric (0.92) x League Average for Goals For when Away (1.21) = 0.97
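The two calculations above can be checked with a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, while the metrics and league averages are the ones quoted in the article.

```python
def expected_goals(attack_metric: float,
                   opponent_defence_metric: float,
                   league_average: float) -> float:
    """Expected goals = attack metric x opponent's defence metric x league average."""
    return attack_metric * opponent_defence_metric * league_average


# Home side: home attack x visitors' away defence x league home average.
man_city_goals = expected_goals(1.66, 0.67, 1.49)
# Away side: away attack x hosts' home defence x league away average.
west_ham_goals = expected_goals(0.87, 0.92, 1.21)

print(f"Man City expected goals: {man_city_goals:.2f}")
print(f"West Ham expected goals: {west_ham_goals:.2f}")
```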
Step 5: Use an Online Calculator to Create a Poisson Distribution
Programs like Excel have Poisson functions built in (POISSON.DIST in recent versions), but there are online calculators that can do much of the work for you. My favourite is this one: http://keisan.casio.com/exec/system/1180573180
In the box, “mean λ”, enter the Average Expected Goals for one of the teams. Set the “initial percentile” at 0 and the “increment” at 1. Set the “repetition” at 6. This will then calculate the likelihood of the team scoring anywhere between zero and five goals. Doing this for the example above results in the following:
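If you would rather not rely on the online calculator, the same figures can be reproduced with the standard Poisson formula, P(k) = e^(-λ) x λ^k / k!. The sketch below uses only Python's standard library; the function names are illustrative, and the two means are the rounded expected goals from Step 4.

```python
from math import exp, factorial


def poisson_pmf(goals: int, mean: float) -> float:
    """Probability of exactly `goals` goals when the expected number is `mean`."""
    return exp(-mean) * mean ** goals / factorial(goals)


def goal_distribution(mean: float, max_goals: int = 5) -> list[float]:
    """Probabilities of scoring 0, 1, ..., max_goals goals."""
    return [poisson_pmf(k, mean) for k in range(max_goals + 1)]


man_city = goal_distribution(1.66)  # home side, mean from Step 4
west_ham = goal_distribution(0.97)  # away side, mean from Step 4

for goals, (home_p, away_p) in enumerate(zip(man_city, west_ham)):
    print(f"{goals} goals: Man City {home_p:.2f}, West Ham {away_p:.2f}")
```

Stopping at five goals mirrors the "repetition" setting of 6 described above; the small probability of six or more goals is simply left out.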
Step 6: Calculate Correct Score Probabilities
To work out the chances of any score line, the final step is to multiply the two teams’ goal probabilities together. For example, to work out the probability of a 2-1 win for Manchester City, you would multiply 0.26 (City scoring two) by 0.37 (West Ham scoring one). This is roughly 0.1 (a 10% chance). This can be converted to decimal betting odds by dividing 1 by 0.1: $10.00.
With a bit of work, it is relatively straightforward to set up a matrix in Excel to calculate all possible outcomes:
If you then create the same matrix, and simply divide 1 by each of the numbers, you get the equivalent decimal odds:
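For readers who prefer code to spreadsheets, the two matrices can be sketched in Python. This assumes, as the multiplication in Step 6 does, that the two teams' goal counts are independent; the function names and the five-goal cut-off are illustrative choices rather than anything prescribed by the method.

```python
from math import exp, factorial


def poisson_pmf(goals: int, mean: float) -> float:
    """Probability of exactly `goals` goals when the expected number is `mean`."""
    return exp(-mean) * mean ** goals / factorial(goals)


def score_matrix(home_mean: float, away_mean: float, max_goals: int = 5):
    """probs[h][a] is the probability of the score line h-a."""
    home = [poisson_pmf(k, home_mean) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, away_mean) for k in range(max_goals + 1)]
    return [[h * a for a in away] for h in home]


probs = score_matrix(1.66, 0.97)                   # Man City v West Ham
odds = [[1 / p for p in row] for row in probs]     # decimal odds per score line

print(f"1-0: probability {probs[1][0]:.3f}, odds {odds[1][0]:.2f}")
print(f"2-1: probability {probs[2][1]:.3f}, odds {odds[2][1]:.2f}")
```

Rows are the home team's goals and columns the away team's, so `probs[2][1]` is a 2-1 home win.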
The last step is to compare the odds of your model with the odds available in the betting markets. If the odds on a 1-0 win for Manchester City are better than $8.36, then you probably have a value bet.
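The comparison itself is a one-liner, but writing it down makes the rule explicit. In this sketch the market price of 9.00 is a hypothetical figure, the function names are made up, and commission is ignored.

```python
def is_value_bet(model_odds: float, market_odds: float) -> bool:
    """A bet offers value when the market pays more than the model's fair odds."""
    return market_odds > model_odds


def expected_profit_per_unit(model_probability: float, market_odds: float) -> float:
    """Average profit per 1 unit staked, if the model's probability is right."""
    return model_probability * market_odds - 1


model_odds = 8.36    # the model's price for a 1-0 Manchester City win
market_odds = 9.00   # hypothetical price available in the market

print(is_value_bet(model_odds, market_odds))
print(f"{expected_profit_per_unit(1 / model_odds, market_odds):.3f}")
```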
What are the Advantages of Using the Poisson Distribution?
In the case of soccer, at least, it’s a remarkably accurate predictor of score-line probabilities. Looking at data from the top leagues in England, Spain, Italy and France over several seasons, the probability of each score outcome suggested by a Poisson distribution matches reality closely. It seems to slightly underplay the chances of 0-0 draws, and slightly overplay the chances of 1-0 wins, but these variations are easy to account for when betting.
It also gives punters an alternative way of working out the probability in other betting markets. For example, in Match Odds markets, the odds of a home win can be worked out by adding up the probability of all the score-lines which result in a home win. A similar approach can be taken in Over/Under markets.
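A minimal sketch of that adding-up, again in Python and again assuming the two teams' goals are independent. The function name, the 10-goal cut-off and the 2.5 goal line are assumptions chosen for illustration; the expected goals are the rounded Man City and West Ham figures from Step 4.

```python
from math import exp, factorial


def poisson_pmf(goals: int, mean: float) -> float:
    """Probability of exactly `goals` goals when the expected number is `mean`."""
    return exp(-mean) * mean ** goals / factorial(goals)


def market_probabilities(home_mean: float, away_mean: float,
                         max_goals: int = 10, goal_line: float = 2.5) -> dict:
    """Sum score-line probabilities into Match Odds and Over/Under markets."""
    home_win = draw = away_win = over = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
            if h + a > goal_line:
                over += p
    # Score lines above max_goals are ignored, so totals fall a hair short of 1.
    return {"home": home_win, "draw": draw, "away": away_win,
            "over": over, "under": home_win + draw + away_win - over}


markets = market_probabilities(1.66, 0.97)
for name, probability in markets.items():
    print(f"{name}: probability {probability:.3f}, odds {1 / probability:.2f}")
```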
Perhaps the biggest advantage, though, is that it allows punters to become more detached from what they are betting on: only investing when they have quantitative data which tells them they are likely to be getting a value bet. This alone – shifting the mentality of a punter to the point where they only bet when the odds are in their favour – is the first step to becoming a profitable bettor.
What are the Limitations of the Poisson Distribution?
When adopting a Poisson approach, it is difficult to know the period over which you should calculate your averages. Taking soccer as an example again, limiting your data to a single season means that, in the early part of the season, you will be basing your bets on a small and potentially unreliable data set. However, extending your average into the previous season (or seasons) means that you are introducing far more variables.
Teams remain relatively stable within any one season, but across seasons players and managers arrive and depart, and team form can rise and fall accordingly. Basing your analysis on team data that is no longer relevant will decrease its accuracy. Having said that, in my experience, the advantages of having a larger sample of data seem to outweigh this factor.
Nor can a Poisson distribution account for other variables that may be relevant. For example, cup competitions in many countries have been shown to produce a greater number of goals than expected. Also, derby matches are known to be more volatile and require special attention.
And key moments within seasons and games, like a team needing a win to avoid relegation, can result in tactical decisions which alter the expected distribution of outcomes. However, as long as you don’t use a Poisson distribution blindly, and are prepared to avoid games that may be more volatile, this risk can be mitigated.
Can it be Used in Other Sports?
Yes. After a lot of work, I used it successfully to predict the likely score of snooker matches. By also using it as an alternative method of calculating win odds, I could constantly check the reliability of the Elo ratings system I was using.
For me, soccer and snooker are as far as I’ve taken it. Specialising is crucial if you are to manage the workload of punting and remain profitable (not to mention maintaining a life away from it) and I won’t be branching out any time soon.
However, theoretically, a Poisson distribution can be used in any sport where two opponents score points against each other, whether that be tennis, cricket, rugby or Aussie rules. Now obviously, when it comes to cricket, with the possibility of high scores on both sides, you’re going to be creating some big spreadsheets. Having said that, if you’re skilled in that area there are some workarounds that can make life a bit easier. Better still, if you’re able to use databases, many of the processes can be automated.
Crucially, though, you must test the results that you are producing. Whenever I make changes to any betting model, I do this in two ways. First, I take an historical data set and see how accurate my model is at predicting what happened. Past success is no guarantee of future success, as variables alter, but it will at least increase your confidence that your model is sound. Second, I use my model to paper-trade for a period, to prove its profitability, before slowly increasing my stakes.
Using statistical approaches like this to assist your betting might seem like hard work, but if you like methodical problem solving, then you’ll probably enjoy it. And anyway, you wouldn’t invest money in a house, a car or a pension without doing a bit of research first, so why would you invest your money with Betfair any differently?