Martin Ingram – Data Scientist
Martin is passionate about good code and gaining insights from data, with a particular interest in sports analytics. He completed a Bachelor of Arts in Natural Sciences (Physical) at Cambridge before completing his Master of Science in Computing Science.
In this first article of a series, I give an overview of the different approaches commonly used to predict tennis matches. Broadly speaking, the different models fall into three categories: ranking-based models, regression-based models, and point-based models. These categories are not mutually exclusive — it is possible to combine all three — but each approach is distinct and interesting in its own way.
Ranking-based models predict matches by finding a way to rank players by skill, and then predicting the higher-ranked player to win. The simplest and best-known ranking is the ATP Ranking for men and the WTA Ranking for women. This ranking is a good predictor, with the higher-ranked player winning 68.1% of matches on the ATP in 2014, for example. There is, however, an interesting and more predictive alternative to the world rankings.
This alternative is the Elo rating, named after its inventor Arpad Elo, who originally developed it to rank chess players. Under the Elo system, every player starts with the same rating of 1500, which is then updated after each match they play. While the world rankings award players points for the stage and tier of the tournament they reach, the number of Elo points a player gains or loses depends on their opponent’s Elo rating.
If a player beats an opponent with a higher Elo rating than themselves, they gain more points than if they beat a player with a lower Elo rating. Conversely, if a player loses to a player with a lower Elo rating, they lose more Elo points than if they lost to a player with a higher rating.
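The update rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the exact model discussed in this series: it uses the standard chess form of Elo, the function names are made up for this example, and the step size `k=32` is an assumed value (how to tune it is a separate question).

```python
def expected_score(rating, opponent_rating):
    """Win probability implied by two Elo ratings (standard chess form)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_elo(winner_rating, loser_rating, k=32.0):
    """Return the new (winner, loser) ratings after one match.

    k is an ASSUMED step size; it is not a value taken from the article.
    """
    surprise = 1.0 - expected_score(winner_rating, loser_rating)
    delta = k * surprise  # large when the winner was the underdog
    return winner_rating + delta, loser_rating - delta


# A 1500-rated player beating a 1700-rated opponent gains more points
# than the same player beating a 1300-rated opponent.
gain_vs_stronger = update_elo(1500, 1700)[0] - 1500
gain_vs_weaker = update_elo(1500, 1300)[0] - 1500
print(gain_vs_stronger, gain_vs_weaker)
```

Because the winner's gain equals the loser's loss, the same function also covers the converse case: losing to a lower-rated opponent costs more points than losing to a higher-rated one.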
The Elo system produces interesting illustrations. For instance, the plot below shows how the Elo ratings for the “Big 4” ATP players, Roger Federer, Novak Djokovic, Rafael Nadal and Andy Murray, developed from 2007 until the start of the US Open 2016. The ratings give insight into the relative strength of the players: throughout 2007, for example, Roger Federer was dominant, followed by Rafael Nadal, Novak Djokovic, and Andy Murray. Since Nadal’s decline from 2014 onwards, Novak Djokovic has led the field.
Thanks to his wins at Wimbledon and the Olympics, Andy Murray now has the second-highest Elo rating, but the gap between him and Novak Djokovic is still large at present. Elo provides a formula for converting the difference in rating points to a probability of winning, and Murray’s Elo of 2365 compared to Djokovic’s of 2515 means that Djokovic’s chance of winning is about 70%.
Elo ratings are particularly interesting because they produce very accurate predictions. Stephanie Kovalchik compares 11 published tennis models in her paper, including all the models mentioned in this article. She found that an Elo model was more accurate than any other model (70% of matches predicted correctly on the ATP in 2014), with the exception of betting odds (72% of matches predicted correctly).
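As a worked version of this conversion, the calculation below assumes the standard logistic Elo formula from chess (base 10, scale factor 400). The exact formula is not spelled out here, so treat this as an assumption, although it does reproduce the figure of about 70%.

```latex
P(\text{Djokovic beats Murray})
  = \frac{1}{1 + 10^{-(2515 - 2365)/400}}
  = \frac{1}{1 + 10^{-0.375}}
  \approx 0.70
```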
Regression-based models are useful when data is available that may be predictive of an outcome, but the precise relationship is not known. For example, a higher-ranked player is likely to win against a lower-ranked player, but what win probability does, say, a ten-point ranking difference correspond to?
A regression model can produce an estimate. Given factors that may predict the match outcome, the model will evaluate both whether the factors are really correlated with the outcome, and if so, by how much. One such analysis was done by del Corral and Prieto-Rodriguez in a paper from 2010 (Corral).
They built a model with 20 inputs, containing the ranking difference as well as factors relating to the stage of the tournament (which round), the tournament itself (which Grand Slam it was, since they only consider the major tournaments), and differences among the players, such as whether one or both are left-handed or had been in the top 10 in the past. For the ATP, they find that the ranking difference is the most important predictor, but other factors, such as age difference and the players’ past performance at the tournament, also play a role. Others, such as the difference in height, turn out to be insignificant.
Regression models perform well and are appealing because their accuracy largely depends on how good the inputs are. The best published regression models (68% correct on the ATP in 2014) do not perform as well as the Elo model, but they may do so with better inputs. They can also be a powerful way to combine predictions from several models: for instance, a regression model using the Elo prediction together with additional variables may be an interesting model.
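To make the regression idea concrete, here is a minimal sketch of a one-input logistic regression fitted by gradient ascent. Everything in it is an assumption for illustration: the data are synthetic (generated from a made-up coefficient, not real match results), the single input stands in for any "difference between the two players" feature, and the intercept is omitted so that a zero difference maps to a 50% win probability. It is not the del Corral and Prieto-Rodriguez model.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, epochs=500):
    """Fit P(win) = sigmoid(w * x) by gradient ascent on the log-likelihood."""
    w = 0.0
    for _ in range(epochs):
        gradient = sum((y - sigmoid(w * x)) * x for x, y in zip(xs, ys)) / len(xs)
        w += learning_rate * gradient
    return w


# SYNTHETIC data for illustration only: x is a hypothetical skill difference
# and TRUE_W is a made-up coefficient used to generate the outcomes.
rng = random.Random(0)
TRUE_W = 0.8
xs = [rng.uniform(-3.0, 3.0) for _ in range(500)]
ys = [1 if rng.random() < sigmoid(TRUE_W * x) else 0 for x in xs]

w = fit_logistic(xs, ys)
print("estimated coefficient:", w)
print("win probability at x = 1:", sigmoid(w * 1.0))
```

The fitted coefficient answers the "by how much" question: it translates a given difference in the input into a win probability. Adding further inputs, such as an Elo-based prediction, would mean fitting one coefficient per input in the same way.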
The final class of models widely used for tennis are point-based models. While Elo models can be applied to many sports and regression models are very general, point-based models are specifically designed around the rules of tennis.
Point-based models attempt to model a tennis match from the point level upwards. They assume that the probability of winning a point on serve is fixed throughout the match for each player. With this assumption, calculating the probabilities of winning a service game, set, and match is just a matter of summing all the possible ways of winning. The mathematics is fairly complex, but the equations can be solved very quickly on a computer.
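For a single service game, the "summing all the possible ways of winning" step can be written out explicitly. The derivation below uses the usual textbook form, with p as the server's probability of winning a point and q = 1 - p (symbols introduced here for illustration). The server can win to love, to 15, to 30, or via deuce, and the number in front of each term counts the orderings of points that lead to that score.

```latex
P_{\text{game}}
  = \underbrace{p^4}_{\text{to love}}
  + \underbrace{4\,p^4 q}_{\text{to 15}}
  + \underbrace{10\,p^4 q^2}_{\text{to 30}}
  + \underbrace{20\,p^3 q^3}_{\text{reach deuce}} \cdot D,
\qquad
D = p^2 + 2pq\,D
  \;\Longrightarrow\;
D = \frac{p^2}{1 - 2pq}
```

Here D is the probability of winning from deuce: the server either wins the next two points, or the players split them and return to deuce. Sets and matches are built up from games in the same way.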
The plot above shows an example of the equations. Fixing the opponent’s probability of winning a point on serve at 64% (the ATP tour average), it is possible to calculate the probability of a player winning the match for different serve probabilities of their own. The dashed line marks the opponent’s serve-point probability of 64%. If player 1 also wins 64% of their points on serve, their probability of winning the match is exactly 50%. If they win 60%, however, it drops to about 31%; if they win 70%, it rises to 77%.
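A curve like this can be computed with a short recursive calculation. The sketch below is one possible implementation under stated assumptions, not the code behind the plot: it assumes a best-of-three match, a first-to-7 tiebreak at 6-6 in every set, and that player A serves first in each set. All function names are invented for this example.

```python
from functools import lru_cache
from math import comb


def hold_probability(p):
    """Probability the server wins a game, given point-win probability p (0 < p < 1)."""
    q = 1.0 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)  # wins to love, 15 or 30
    win_from_deuce = p**2 / (1 - 2 * p * q)
    return before_deuce + 20 * p**3 * q**3 * win_from_deuce


def tiebreak_probability(pa, pb):
    """Probability A wins a first-to-7 tiebreak, with A serving the first point."""
    @lru_cache(maxsize=None)
    def win(a, b):
        if a == 6 and b == 6:
            # From a tied score each player serves one of the next two points.
            a_both, b_both = pa * (1 - pb), (1 - pa) * pb
            return a_both / (a_both + b_both)
        if a == 7:
            return 1.0
        if b == 7:
            return 0.0
        a_serves = ((a + b + 1) // 2) % 2 == 0  # serve order A, BB, AA, BB, ...
        p_point = pa if a_serves else 1 - pb
        return p_point * win(a + 1, b) + (1 - p_point) * win(a, b + 1)
    return win(0, 0)


def set_probability(pa, pb):
    """Probability A wins a set (ASSUMED: A serves first, tiebreak at 6-6)."""
    hold_a, hold_b = hold_probability(pa), hold_probability(pb)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a == 6 and b == 6:
            return tiebreak_probability(pa, pb)
        if a >= 6 and a - b >= 2:
            return 1.0
        if b >= 6 and b - a >= 2:
            return 0.0
        p_game = hold_a if (a + b) % 2 == 0 else 1 - hold_b
        return p_game * win(a + 1, b) + (1 - p_game) * win(a, b + 1)
    return win(0, 0)


def match_probability(pa, pb, sets_to_win=2):
    """Probability A wins the match (ASSUMED best-of-three by default)."""
    s = set_probability(pa, pb)
    return sum(comb(sets_to_win - 1 + k, k) * s**sets_to_win * (1 - s)**k
               for k in range(sets_to_win))


for p in (0.60, 0.64, 0.70):
    print(p, round(match_probability(p, 0.64), 3))
```

Passing `sets_to_win=3` switches the calculation to a best-of-five match, which makes the curve steeper around the 50% point.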
Assuming that the probability of winning a point on serve is constant throughout the match might seem a little simplistic — for example, it means that the model assigns the same win probability to a break point in the final set as it does to the very first point in the match, even though the break point likely involves a great deal more pressure on the server, which may influence the probability. Klaassen & Magnus studied whether the assumption holds, and while they found that it does not — servers are more likely to lose points under pressure, and effects like momentum play a role — they conclude that it is still a good approximation for prediction.
If we accept this assumption of constant probability (often referred to as the i.i.d. assumption), the challenge with point-based models is to accurately predict these probabilities for both players. One of the best models for doing this is Barnett & Clarke’s opponent-adjusted model (Barnett & Clarke), which combines player statistics — how well a player serves and returns compared to the average player — to estimate the serve-winning probabilities for a match. This model does about as well as the regression-based models for predicting match outcomes, i.e. it is good but falls short of Elo (67% accuracy for ATP matches in 2014).
Point-based models are interesting as they can produce a great deal of information about a match. As they are constructed from the point-level upwards, they easily generate probabilities over the number of sets, number of games and set scores, to name just a few. As an example, the plot above shows the prediction of most likely set scores for this year’s Wimbledon final. Andy Murray played Milos Raonic, and Barnett & Clarke’s model predicted serve-winning probabilities of 70.9% for Andy Murray and 67.0% for Milos Raonic. According to these and the i.i.d. model, scores of 7-6 and 6-4 were expected to be most likely. In this case, the model did very well: the match finished 6-4 7-6 7-6 for Murray.
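One way to see how set-score probabilities fall out of a point-level model is to simulate sets point by point. The sketch below swaps in Monte Carlo simulation for the exact equations, purely for illustration. It assumes a first-to-7 tiebreak at 6-6 and that player A (here given Murray's serve probability) serves first. The function names, the number of simulated sets, and the random seed are all choices made for this example.

```python
import random
from collections import Counter


def play_game(p_server, rng):
    """Simulate one service game; return True if the server holds."""
    server = receiver = 0
    while True:
        if rng.random() < p_server:
            server += 1
        else:
            receiver += 1
        if server >= 4 and server - receiver >= 2:
            return True
        if receiver >= 4 and receiver - server >= 2:
            return False


def play_tiebreak(pa, pb, rng):
    """Simulate a first-to-7 tiebreak; return True if A wins. A serves first."""
    a = b = 0
    while True:
        a_serves = ((a + b + 1) // 2) % 2 == 0  # serve order A, BB, AA, BB, ...
        if rng.random() < (pa if a_serves else 1 - pb):
            a += 1
        else:
            b += 1
        if a >= 7 and a - b >= 2:
            return True
        if b >= 7 and b - a >= 2:
            return False


def play_set(pa, pb, rng):
    """Simulate one set; return the final (games_a, games_b). A serves first."""
    a = b = 0
    while True:
        if a == 6 and b == 6:
            return (7, 6) if play_tiebreak(pa, pb, rng) else (6, 7)
        if (a + b) % 2 == 0:
            a_wins = play_game(pa, rng)
        else:
            a_wins = not play_game(pb, rng)
        if a_wins:
            a += 1
        else:
            b += 1
        if max(a, b) >= 6 and abs(a - b) >= 2:
            return (a, b)


def set_score_frequencies(pa, pb, n_sets=20000, seed=0):
    """Estimate the probability of each set score by simulation."""
    rng = random.Random(seed)
    counts = Counter(play_set(pa, pb, rng) for _ in range(n_sets))
    return {score: count / n_sets for score, count in counts.items()}


# Serve-winning probabilities quoted above for the Wimbledon final.
freqs = set_score_frequencies(0.709, 0.670)
for score, freq in sorted(freqs.items(), key=lambda item: -item[1])[:4]:
    print(score, round(freq, 3))
```

The same simulated sets could be tallied in other ways, for example by total number of games, which is what makes the point-level construction so flexible.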
This article described three ways a tennis match can be modelled. Each approach has its merits: the regression approach is very flexible, the Elo ranking is very accurate, and the point-based models give a wealth of information about each match.
This article is intended to be an overview. In the next article, I will describe the Elo model in more depth, with details on how it is calculated and how it can be tuned to perform well.