The McNamara Fallacy and Streetlight Bias

Jack Houghton, in his latest Psychology of Betting piece, presents a three-part series on the Streetlight Bias and McNamara Fallacy.

According to Jack, too many punters are only looking for the right data in the light when they should be looking in the dark. This can also lead to analysing numbers that don’t need to be looked at or are not fit for purpose.

For the latest Psychology of Betting Articles, head to the Betfair Hub.


Get The Data

In the 2003 Errol Morris documentary, The Fog of War, ex-U.S. Secretary of Defence, Robert McNamara, sets out a series of eleven lessons gleaned from his life.

The sixth lesson, “Get the data”, recounts McNamara’s time at Ford Motor Company and a discovery he made there: of the 40,000 people who died in car crashes each year, the majority were killed by their bodies hitting the steering wheel as a result of the impact, rather than by the impact itself.  Using this information to direct research and development, Ford later began fitting cars with seat belts, which went on to save numerous lives.

Whether McNamara’s recollection of this neat fable about the power of data is accurate or not is hard to know, but it will certainly chime with any number-savvy punter.  Indeed, it’s hard to recall an article I’ve written for this site that doesn’t demand the necessity of a data-led approach to punting as a means to combat the shortcomings of our mis-evolved, bias-ridden brains.


Dangers of Data-Bias

The desire for quantifiable data does have its dangers, though.  In an ironic turn, our longing for statistics can become insatiable; we can become data-biased.  We can start to believe that having any number – even if it is wildly suspect and baseless – is better than having no number at all.  And because data makes us feel confident – after all, the numbers don’t lie – we then act with a self-assurance and certainty that is unwarranted.

Instead of being aware of the shortcomings of the data we are using and acting with due caution, we believe our omniscience and proceed at speed – with disastrous consequences.

Anyone who has been involved with creating business cases will be familiar with the dangers of this data-bias.  It is often impossible to accurately predict how much a project will cost, or how much revenue it will generate.  There are usually a vast number of variables – many unknown at project inception – that are impossible to quantify with any certainty.  And yet there we sit, plucking numbers from nowhere, adjusting them along the way so that the final number looks big enough to persuade others to fund the initiative.

That’s not to say that business cases do not have value when used well.  Often, the process of creating them will illuminate what we don’t know, and highlighting these areas of ambiguity at the outset can lead to better decision-making about a project’s risks, helping those carrying out the project to mitigate them successfully.

However, it’s rare, in my experience at least, for those creating business cases to view them in this tentative way.  Instead of accepting what is not known, spreadsheets are enthusiastically populated with data, even when that data is inappropriate, inaccurate, or non-existent.

As punters, we need to reflect on – and guard against – this data-bias.  Whilst, for the most part, a numbers-led approach to betting will move us towards long-term profitability, we need to be wary of the relevance and quality of the data we are using, not getting hoodwinked into thinking that all numbers have the same utility.

Take horse racing.  Around the world, various news agencies provide data which is marketed as being helpful to punters.  For example, in the last decade or so there has been a trend for news agencies to produce measurements of trainer- and jockey-form.  In most cases these are grossly over-simplistic, often being little more than a percentage of winners-to-runners, and offering little genuine analysis of the highly-complex and multi-faceted phenomena they claim to measure. Dressed up with colour-coded graphics and a catchy brand-name, though, these numbers can seem to provide a powerful insight to anyone who craves the “certainty” of data.
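To illustrate why a bare winners-to-runners percentage is a thin measure, here is a minimal sketch in Python that puts a confidence interval around a strike rate. The function name and the example figures are hypothetical. The Wilson score interval is just one reasonable choice of method, and it assumes each runner is an independent trial with the same chance of winning, which is itself a gross simplification.

```python
import math


def strike_rate_interval(winners, runners, z=1.96):
    """Wilson score interval for a winners-to-runners strike rate.

    z=1.96 corresponds to roughly 95% confidence.
    Assumes every runner is an independent trial with an equal chance
    of winning, which real racing does not satisfy.
    """
    if runners <= 0:
        raise ValueError("runners must be positive")
    if not 0 <= winners <= runners:
        raise ValueError("winners must be between 0 and runners")
    p = winners / runners
    denom = 1 + z ** 2 / runners
    centre = (p + z ** 2 / (2 * runners)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1 - p) / runners + z ** 2 / (4 * runners ** 2)
    )
    return centre - half_width, centre + half_width


# Hypothetical figures: 5 winners from 20 runners is a 25% strike rate.
low, high = strike_rate_interval(5, 20)
print(f"25% strike rate from 20 runners: plausible range {low:.0%} to {high:.0%}")
```

On these made-up figures the plausible range runs from roughly 11% to 47%, which is a reminder of how little a headline "25%" says on its own when the sample is small.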

Robert McNamara, where we started this article, is often denounced for his obsession with data, with some historians claiming it led to many of the poor political decisions surrounding the Vietnam War.  As then Secretary of Defence, McNamara’s approach to handling the conflict is illustrative for any would-be profitable punter who might have an unhealthy craving for data-at-all-costs.


The McNamara Fallacy

Under McNamara’s direction, overly-simplistic measures were used to track the progress of the war and inform future policy.  It was believed that increasing North Vietnamese casualties, alongside increasing US troop numbers, would inevitably lead to victory.  The problem for McNamara and the US – unknown at the time – was that these measurements were a poor reflection of reality.  They failed to appreciate that the North Vietnamese had a ready supply of new combatants; they failed to question the accuracy of enemy body-count data; and they failed to appreciate that drafted American soldiers were not necessarily numerically equal to Viet Cong guerrillas.  The numbers might have told McNamara that he was winning; the reality was different.

The catastrophic decision-making of the Vietnam War has led some to name the data-bias at its heart after the person who encouraged it: the McNamara Fallacy now describes any behaviour where information is ignored unless it can be numerically quantified.

Anyone who bets in markets where quantifiable data is sparse might have some sympathy for McNamara, or at least understand the predicament he faced.  I certainly do.


My Dirty Secret

My dirty secret – one I don’t often admit to data-savvy punting friends – is that I bet, annually, on a UK talent show.  I have done so since its inception over 15 years ago.  Worse still, perhaps, I’ve done so profitably.  In fact, in percentage terms at least (volumes are low), it’s been the most successful area of my punting during that time.  So here I am on this site, continuously advocating that punters adopt a quantitative approach to punting if they are to improve their profitability, whilst my own punting performs best in markets where no such data is available.

Putting my embarrassment at this fact aside, though, what’s been interesting when betting in these markets is witnessing the McNamara Fallacy at work.  In the early years of the show, internet forums and polls sprang up to offer coverage and insight.  More recently, contestants have had their own YouTube channels and Twitter feeds.  And punters have lapped up the data they provide – merrily basing their punting decisions on the results of these online polls and on the numbers of followers each contestant has been able to garner on social media.

It’s no surprise.  In a desert, a glass of urine can look like a cold beer; and in a betting market where no quantifiable data is available, anything that has a number attached to it can seem like a heavenly gift.  The data we really want – voting records from previous shows – is not publicly available, so instead we look for data elsewhere.


Streetlight Bias

This tendency – to focus our attention where it is easiest to look – is sometimes referred to as streetlight bias.  It is named after several well-worn anecdotes which have calamitous central characters only willing to search for answers where there is light available to them.  In more recent versions, the anecdotes feature a drunkard searching for his car keys under a streetlight, despite knowing that he lost them elsewhere.

In the case of the UK talent show that I bet on, we don’t get to see the data we really need, but as a streetlight shines elsewhere – online in the shape of polls, views, and followers – we are drawn to focus our attention there.  And whilst the data is interesting – and shouldn’t be immediately dismissed – it needs sceptical consideration.  Exactly what is it measuring?  And what bearing will it have on the show itself?  My assessment is that its use is limited, with high numbers generally sign-posting notoriety rather than likely success in the competition.  It might shine a light on something, but not on something that is especially useful to me as a punter.

Next time, I’ll look at how we should approach situations where the data we really want is not available, but in the meantime, a note to all punters: it’s right that we try to get the data – as McNamara encourages us to – but take some time to consider whether it’s the right data, make sure it’s accurate, and don’t ignore other information just because there is no light illuminating it.

Previously

In our last article, we explored the tendency for some punters to only look for data where it is easy to find – the streetlight bias – instead of accepting that the data they really needed was to be found elsewhere – in some dark recess away from the light – and, ultimately, might not be available at all.

We also examined how this tendency often coalesced with the McNamara Fallacy: the sometimes obsessional need to have a number to analyse, even when no such number exists, or when the numbers you do have are inaccurate, or not fit for purpose.

Within punting, this is perhaps seen most starkly in the way that many racing punters obsess over betting market information. The world over, betting “analysts” are employed to report market movements, telling us which horses are shortening and lengthening in price, and speculating as to the reasons for the shifts.

These shysters tell us, in hushed tones, that the “smart money” is on the second-favourite whose odds have tumbled, or that “connections” have lumped-on the outsider. Or we might learn that a horse whose odds are drifting in the market reflects concerns around a stable virus, or the jockey’s view that the draw is disadvantageous.


The Market Knowing Something

The problem with this kind of “analysis” – aside from the fact that it is, at best, creatively speculative: the desperate attempt of a misinformed soul, feeling the pressure to fill air time with something more than just reading out the latest odds – is that betting markets are only a measure of one thing: the cumulative effect of those backing and laying each horse within a race.

They tell us little else. By definition, then, any speculation as to the reason why all the people operating within the market have acted in their multitude of ways will be, at best, an oversimplification, and offer little useful insight.

This explains why, in a 2006 study I led, it was found that laying those horses in UK racing whose odds shortened by an implied percentage chance of 5% or more in the last two hours before the race would have led punters to a significant profit. The study concluded that punters were blindly following positive market moves in the belief that the market “knew something”, leading to artificially short odds that overestimated a horse’s chance of victory.

It should be noted that the study was theoretical and doesn’t necessarily offer an easy system for long-term profit. It would require a process that could measure these odds movements, bet (and be matched) at the last possible moment before the off, and be sufficiently automated to be applied to every race. It’s also an old study, and was limited to UK racing.
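As a rough illustration of the kind of measurement such a process would need, the sketch below converts decimal odds to an implied percentage chance and flags a horse whose implied chance has risen by a set amount. The function names and prices are hypothetical. It also assumes that "5% or more" means a rise of five percentage points in implied chance, which is one reading of the study's rule rather than a confirmed definition, and it ignores any overround or commission.

```python
def implied_probability(decimal_odds):
    """Implied chance of winning for decimal odds, ignoring overround."""
    if decimal_odds <= 1:
        raise ValueError("decimal odds must be greater than 1")
    return 1 / decimal_odds


def is_significant_shortener(odds_two_hours_out, odds_at_off, threshold=0.05):
    """True if the implied chance rose by at least `threshold`.

    The default of 0.05 treats the rule as a rise of five percentage
    points (an assumption about how the rule was defined).
    """
    move = implied_probability(odds_at_off) - implied_probability(odds_two_hours_out)
    return move >= threshold


# Hypothetical prices, two hours before the race and at the off.
for before, after in [(6.0, 4.0), (5.0, 4.8)]:
    flagged = is_significant_shortener(before, after)
    print(f"{before} -> {after}: significant shortener = {flagged}")
```

A move from 6.0 to 4.0 takes the implied chance from about 17% to 25% and is flagged; a move from 5.0 to 4.8 is less than a single percentage point and is not.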

No, market moves tell us little of use. But punters obsess about them because they play to a belief and fear that others know more than we do – especially that there are those operating with infallible “inside information” – and that if we follow the market moves, we are, by proxy, gaining access to the advantages of that knowledge.


Objective Assessment

A similar phenomenon – an information cascade – happens in financial markets. Traders, seeing that others are buying or selling a stock, follow suit, artificially skewing the prices and, in extreme situations, causing bubbles, and the inevitable crashes which follow.

It’s little surprise, then, that we see these unfounded “gambles” in almost every racing market, where punters, craving data, naturally look where it is easy to find: in the betting data on the screen in front of them.

And we see a similar thing happening in political betting. If you are a punter who likes the “certainty” of data, these kinds of markets, by their very nature, are made for discomfort. They ask punters to make predictions about how a mass of people will vote on a given day (no easy task), and the information available to make this prediction – print, radio, television, and social media reports – is chaotically contradictory.

Which is why these kinds of punters will naturally gravitate towards opinion polls, because they seem to represent an objective aggregate assessment of the behaviour that you are trying to predict.


The Dangers

Opinion polls, though, are fraught with issues.  First, the sample sizes – typically around 1,000 people – are small, and the mathematical jiggery-pokery that occurs after the sample is taken (which involves extrapolating the answers of the sample to the population as a whole based on demographic modelling) reduces validity.

Second, there is no guarantee that those polled in the sample will answer honestly.  Indeed, some analysts have argued that the failings of polls in the run-up to the 2016 US Presidential election were a result of participants not wanting to admit their support for a controversial candidate.

And third, many polls are carried out by media organisations who may have a political bias, suggesting that polls can sometimes be as much about influencing voter behaviour as they are about reporting likely behaviour.
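To put a number on the first of these issues, the sketch below computes the textbook margin of error for a poll of a given size. The function name is hypothetical, and the formula assumes a simple random sample answered honestly. That is the best case: it takes no account of the demographic weighting, dishonest answers, or bias described above, all of which would tend to make the real uncertainty larger.

```python
import math


def poll_margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95% margin of error for a reported vote share.

    Assumes a simple random sample; share=0.5 gives the widest margin.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= share <= 1:
        raise ValueError("share must be between 0 and 1")
    return z * math.sqrt(share * (1 - share) / sample_size)


moe = poll_margin_of_error(1000)
print(f"Best-case margin of error for 1,000 respondents: +/- {moe:.1%}")
```

Even in this idealised case, a poll of 1,000 people carries a margin of roughly three percentage points either way, so a one- or two-point shift between polls may be nothing more than sampling noise.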

Despite the limitations of polling data, though, betting markets react to each new poll instantly, faithfully believing that they represent some significant shift in voter-land. I’ve often wondered whether a profitable betting strategy would simply be to oppose whatever market move follows the release of a poll, in much the same way that the 2006 UK horse racing study pointed towards the profit to be had from unfounded moves in those markets.

What’s clear in all this – whether we’re looking at online forums predicting the outcome of reality television shows (as covered in my last article), horse racing market betting moves, or political opinion polls – is that we need to be wary of the data that is easily available to us, and not assume that any data is better than no data at all.

Perhaps it’s even worth having a series of questions that we use to interrogate the usefulness of any data.

1: What is the data measuring?

Body-count data in the Vietnam War didn’t measure the number of Viet Cong dead; it measured the number that generals in the field wanted to report as dead to boost morale and satisfy their paymasters. Market movements in horse racing betting measure aggregate punter sentiment, not the likelihood of a horse winning.

And political opinion polls tell you how a minority of people that are reachable by phone and want to answer the questions of a pollster say they intend to vote on a day in the future; they don’t tell you how they, or anyone else, will vote.

Asking this fundamental question, then, is crucial.

As an example, an Australian rugby fan I know was so delighted with his team’s string of wins against New Zealand, Japan, and Wales in the 2017 end-of-year internationals that he placed a bet he couldn’t afford on them drubbing England.

His rationale was that the victories demonstrated that Australia were now the best team in the world.  The subsequent loss was hard to take, and his attempts to chase that loss with a large odds-on punt on Australia against Scotland in the next match turned out to be financially catastrophic.

The reality was that those Australian victories – two closely-fought matches against strong oppositions in difficult conditions, and an easy outpointing of an inferior Japan – were a measure of a team with marginally improved form from earlier in the season. They were not, as he thought, a measurement showing the firm establishment of a new rugby world-order.


2: What relevance does it have to the outcome of a particular betting market?

Some data passes the test of question one with relative ease, but stumbles at this point. Tennis statistics provide an edifying example of this. Awash with data, commentators will often fill airtime with illuminating metrics, such as the number of aces, first-serve winning percentages, and outright return winners.

As a punter, it can be easy to get carried away with such data, believing that a player who ratcheted-up 30 aces in a previous match is a sure thing to dominate his service games in the next match. There are two problems with this, though.

The first, and an irrefutable issue with all data, is that what happened in the past is not necessarily a good proxy for what will happen in the future. Just because it snowed yesterday, it doesn’t mean it will snow today; and just because a player served well in his last match, that is no guarantee that he will repeat the feat on a different day, against a different opponent, in different conditions.

The second issue is that ace-count has a relatively weak correlation with match victory in professional men’s tennis. I know this because, in the past, I have crunched the numbers using regression analysis and found that the metrics of more interest are second-serve and second-serve return winning percentages: both of which have a far bigger influence on the outcome of a match than ace-count.

So, whilst the data you have might measure something accurately, that does not mean that what is being measured is especially helpful.


3: Can the data be converted into an implied percentage chance of winning (and therefore into implied betting odds)?

Knowing that second-serve and second-serve return winning percentages are highly correlated with match victories in tennis is one thing, but how do you convert that into betting odds?

If one player had double the success rate on these measures in his last match when compared to his future opponent, does that mean he has double the chance of winning?

Even leaving aside the obvious challenge that the measures were historic, established against different opponents under different conditions (see above), it should be clear that this kind of simplistic approach would be unlikely to bring long-term profit.
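To make the simplistic approach concrete, the sketch below does exactly what the question above describes: it treats each player's chance of winning as proportional to his score on a metric, then converts that chance into fair decimal odds. The function names and figures are hypothetical, and the proportional mapping is shown only as a cautionary example, not as a recommended model.

```python
def naive_win_chances(metric_a, metric_b):
    """Naively assume win chance is proportional to each player's metric.

    Nothing in the data justifies this mapping; it is the simplistic
    approach being warned against.
    """
    if metric_a <= 0 or metric_b <= 0:
        raise ValueError("metrics must be positive")
    total = metric_a + metric_b
    return metric_a / total, metric_b / total


def fair_decimal_odds(probability):
    """Decimal odds implied by a probability, with no margin applied."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return 1 / probability


# Hypothetical: player A's success rate on the measure is double player B's.
chance_a, chance_b = naive_win_chances(60, 30)
print(f"Player A: {chance_a:.1%} chance, fair odds {fair_decimal_odds(chance_a):.2f}")
print(f"Player B: {chance_b:.1%} chance, fair odds {fair_decimal_odds(chance_b):.2f}")
```

Under this naive mapping, "double the success rate" becomes a two-in-three chance and fair odds of 1.50. The arithmetic is easy; the difficulty is that the assumption of proportionality has no evidence behind it.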


4: And perhaps most difficult of all, how does the data fit with other data you have?

Which leads us on to this, the thorniest issue of working with data to predict the likely outcome of future events: how does one piece of data fit with other pieces, and how much weighting should you give to each?

Let’s say you have a statistically robust method of measuring a variety of aspects that may affect the outcome of a horse race: horse form, track conditions, draw bias, trainer form, jockey form. What do you do next? Which is the most important of those factors?

If a horse is head-and-shoulders clear on form, but has a bad draw, or the stable is out-of-form, how does that change your plan of action? Are some pieces of data actually measuring varying degrees of the same thing?
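One simple first check on that last question is to see how closely pairs of factors move together. The sketch below computes the Pearson correlation between every pair of factor ratings and lists the pairs that overlap heavily. The function names, the 0.8 threshold, and the ratings are all made up for illustration, and a high correlation only hints that two factors may be measuring the same thing; it does nothing to settle how each should be weighted.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


def overlapping_factors(factors, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| at or above threshold.

    `factors` maps a factor name to a list of ratings, one per horse.
    The threshold is an arbitrary illustrative cut-off.
    """
    names = sorted(factors)
    pairs = []
    for i, name_a in enumerate(names):
        for name_b in names[i + 1:]:
            r = pearson(factors[name_a], factors[name_b])
            if abs(r) >= threshold:
                pairs.append((name_a, name_b, r))
    return pairs


# Made-up ratings for five runners, purely to show the mechanics.
ratings = {
    "trainer_form": [1, 2, 3, 4, 5],
    "jockey_form": [2, 4, 6, 8, 10],  # rises in step with trainer_form
    "draw_bias": [3, 1, 4, 1, 5],
}
for name_a, name_b, r in overlapping_factors(ratings):
    print(f"{name_a} and {name_b} overlap heavily (r = {r:.2f})")
```

In this toy data, jockey form rises in lockstep with trainer form, so counting both at full weight would effectively count one signal twice.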


Where does all this leave us?

There are no easy answers to the questions above and that’s why, for some punters, taking a data-led approach to betting is so fascinating: attempting to solve the unsolvable riddles around using predictive data is a constantly rewarding intellectual pursuit.  For a majority, though, those riddles aren’t even understood, let alone tackled.

And the key point through all this is that, if we want to be profitable, we must be cautious.  The punter acting on no information is not especially disadvantaged against the punter acting on information that is poorly understood.  As we’ve seen, the confidence the latter gets from the belief that they have data that “proves” something of use can make them overconfident, with disastrous consequences.

Whenever presented with any data, then, it is worth repeatedly looping through the questions above.  Doing so will cause brain-ache, for sure, but don’t be afraid of that: view the pain as a way of protecting your wallet.

