March 16, 2021

At least 37 people have suffered potentially life-threatening blood clots after taking the Oxford/AstraZeneca vaccine. That number, on its own, could easily convince you that the vaccine is dangerous. It’s a number that has misled entire governments. And it’s a prime example of how numbers go wrong, and how important it is to present them fairly.

Since it’s very likely that you are British, and therefore have either recently had or will soon have your first jab – and very likely that that jab will have been the Oxford/AZ vaccine – I can understand if this makes you feel nervous. Especially since the Netherlands, Ireland, Denmark, Iceland, Norway, Spain, Thailand and Germany have all suspended or partly restricted the use of the jab.

These are stupid, harmful decisions; they will predictably lead to some avoidable deaths; and there is absolutely no good reason to think that the Oxford vaccine is linked in any way to blood clots. The suspensions are also dangerous because they will leave people with the false impression that the vaccine is unsafe, even after these countries change their minds.

But it is very instructive. Because it shows us a simple but common way in which numbers go wrong in the news: the failure to ask “Is this a big number?” After all, 37 people getting sick sounds really bad, on its own. But is it more than we would expect? What do we need to know to understand that?

We need two more numbers: one, how many people have been given the Oxford vaccine; and two, how many blood clots we would expect to see in that many people, if we hadn’t given them the vaccine. Luckily, we know those two numbers fairly well. About five million people have been given the Ox/AZ vaccine in Europe (about 17 million worldwide, according to AstraZeneca); and about one person in every 1,000 suffers a thrombosis every year.

So you’d expect to see about 5,000 blood clots among the five million recipients of the jab every year — 14 a day, nearly a hundred a week — even if that jab had nothing to do with blood clots whatsoever. Professor Sir David Spiegelhalter, the Cambridge statistician, goes into a bit more detail here if you’re interested.
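That back-of-envelope arithmetic can be checked in a few lines of Python. The figures are the ones quoted above (five million recipients, a background rate of roughly one thrombosis per 1,000 people per year):

```python
# Back-of-envelope check of the expected background rate of blood clots
recipients = 5_000_000          # Oxford/AZ doses given in Europe (figure quoted above)
annual_clot_rate = 1 / 1_000    # roughly one thrombosis per 1,000 people per year

expected_per_year = recipients * annual_clot_rate
expected_per_day = expected_per_year / 365
expected_per_week = expected_per_year / 52

print(f"{expected_per_year:.0f} per year")   # 5000 per year
print(f"{expected_per_day:.0f} per day")     # 14 per day
print(f"{expected_per_week:.0f} per week")   # 96 per week
```

In other words, even a vaccine with no link to blood clots at all would be “followed by” thousands of them, simply because blood clots happen anyway.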

On that basis, it’s hardly surprising that we might see 37 thromboses over a few weeks of vaccination. In fact, it’s surprising that there aren’t quite a lot more, although that’s probably because a lot of people had blood clots and didn’t associate them with the vaccine.

Asking that simple question — “is this a big number?” — and knowing how to answer it would have saved a lot of bother, for journalists and for policymakers. But it often doesn’t occur to people.

So I’ve written a book with my cousin David (he’s an economist) in an attempt to correct this: How to Read Numbers: A Guide to Stats in the News (and Knowing When to Trust Them). Because while it’s true that the public needs to be literate in order to participate in democracy, it’s not enough to be able to read words. We also have to be able to navigate numbers.

Is 37 a large number of thromboembolic events? Probably not. But if you don’t know to ask “is it a big number?”, you might end up being scared when you don’t need to be – and be put at risk of Covid, if you don’t get vaccinated.

How about 361, as in “361 cyclists were killed on London roads between 1993 and 2017.” Is that a lot? Obviously it’s tragic for each cyclist and their families. But should I feel unsafe when I cycle, or not? Should I be out campaigning for TfL to improve cycle safety? Well, that depends how many cycle journeys there were in London in that time. Was it 4,000 a day? 40,000? 400,000? (It was 437,000, so about one journey in every 10 million ended in tragedy. Only you can say whether that is safe enough for you.)
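The cyclist arithmetic works the same way. A rough sketch, using the figures quoted above (and treating 1993–2017 as 25 years at a flat 437,000 journeys a day, which is obviously a simplification):

```python
# Rough per-journey fatality risk for London cyclists, using the figures above
deaths = 361                    # cyclist deaths on London roads, 1993-2017
journeys_per_day = 437_000      # cycle journeys per day in London
years = 25                      # 1993 to 2017, treated as a flat 25 years

total_journeys = journeys_per_day * 365 * years
risk_per_journey = deaths / total_journeys

print(f"about 1 in {1 / risk_per_journey:,.0f} journeys")
```

The exact figure comes out nearer one in 11 million than one in 10 million, but at that order of magnitude the headline conclusion is the same.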

Or: when we say that police-recorded hate crimes have gone up in the last few years, is that the same as saying that actual hate crimes have gone up? Or has the way that hate crimes are reported and recorded changed? (Probably the latter. If you look at survey data, the number of actual hate crimes has been steadily falling. But police-recorded crimes have gone up, because as a society we take them more seriously: so the public has become more likely to report them, and the police more likely to record them.)

It’s important to be able to navigate these issues. You don’t need to be especially good at maths; you simply need to be aware of how numbers can go wrong. It’s more about a way of thinking than about mental arithmetic. It matters, because we make decisions every day — should we have a bacon sandwich? Should we have a glass of wine? Are we safe walking home? — based on numbers which we have seen in the news.

But journalists, too, need to be clearer. We are performing a public service: we hold governments to account; we inform the public. Despite the stereotypes, journalists in my experience are usually well-intentioned and public-spirited. We’re not just out there hawking for clicks; we pride ourselves on finding out things that other people don’t want known, and on explaining complicated things to our readers.

Journalists, though, traditionally aren’t very good with numbers. One of the things I try to do with my columns for UnHerd is explain bad science stories. Often the numbers in these stories mislead not because journalists are doing it on purpose, but because they’re not scientists or statisticians. Questions like “Is that a big number?” “Are we still measuring the same thing?” “Is the study it comes from any good?” simply don’t occur to them. We’ve thought of a neat way to address this: a statistical style guide.

Most publications have style guides. At the Telegraph, the style guide was very concerned with the correct terms of address for aristocracy. (“The Duke of Bedford’s son is the Marquess of Tavistock. Lord Tavistock’s elder son, if he has one, can use the third title of the Duke, and he therefore is Lord Howland.”)

The style guide at BuzzFeed, where I worked after that, was much more concerned with whether or not to hyphenate “butt-dial”, “circle jerk” or “douchebag” (all thus).

The Sunday Sport’s style guide, meanwhile, is extremely particular about how to write “bellend”. (“BOLLOCKS: Full out in copy and in headlines. WANK: Full out in copy, w**k in headlines. BELLEND: One word, full out in copy and headlines.”)

Having a consistent style is useful; it helps publications maintain a clear identity. But as far as we know, no publication has a style guide for numbers. Not just in terms of when to write 100 and when to write “one hundred”, but: how should you present statistics in order to avoid misleading your readers?

So we have put together a list of 11 points that we think will help. They’re in our book. We don’t pretend that they’re the final word, and we’re keen to find out how we might improve them. For instance, Evan Davis, the radio presenter, tells us that he thinks we should explain in words that there is uncertainty around an estimate, or that a study is not very robust, rather than explicitly including confidence intervals and sample sizes.

We do think, though, that if journalists follow our guide, we’ll end up with a better national discussion. People will have a better understanding of the numbers in the news, and therefore make more informed decisions. It might also prevent governments from making insane decisions about perfectly good vaccines, putting thousands of lives at risk in the middle of a pandemic.

Our Statistical Style Guide


1) Put numbers into context

Ask yourself: is that a big number? If Britain dumps 6 million tons of sewage in the North Sea each year, that sounds pretty bad. But is it a lot? What’s the denominator? What numbers do you need to understand whether that is more or less than you’d expect? In this case, for instance, it’s probably relevant that the North Sea contains 54 thousand billion tons of water.

2) Give absolute risk, not just relative

If you tell me that eating burnt toast will raise my risk of a hernia by 50%, that sounds worrying. But unless you tell me how common hernias are, it’s meaningless. Let readers know the absolute risk. The best way to do this is to use the expected number of people it will affect. For instance: “Two people in every 10,000 will suffer a hernia in their lifetime. If they eat burnt toast regularly, that rises to three people in every 10,000.”
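The conversion is trivial once you have the baseline, which is rather the point. A sketch, using the hypothetical hernia figures from the example above:

```python
# Converting a relative risk into absolute terms
# (hypothetical burnt-toast/hernia figures from the example above)
baseline_per_10k = 2            # lifetime hernias per 10,000 people
relative_increase = 0.5         # "raises your risk by 50%"

exposed_per_10k = baseline_per_10k * (1 + relative_increase)
extra_cases = exposed_per_10k - baseline_per_10k

print(f"{baseline_per_10k} in 10,000 rises to {exposed_per_10k:.0f} in 10,000")
print(f"i.e. {extra_cases:.0f} extra case per 10,000 people")
```

A “50% increase” and “one extra case per 10,000 people” are the same fact; only one of them lets the reader judge whether to worry.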

3) Check whether the study you’re reporting on is a fair representation of the literature

Not all scientific papers are born equal. When CERN found the Higgs boson, or LIGO detected gravitational waves, those findings were worth reporting on in their own right. But if you’re reporting on a new study that finds that red wine is good for you, it should be presented in the context that there are lots of other studies, and that any individual study can only be part of the overall picture.

4) Give the sample size of the study – and be wary of small samples

A drug trial with 10,000 subjects should be robust against statistical noise or random errors. A psychological study looking at 15 undergraduates and asking whether washing their hands makes them feel less guilty is much less so. It’s not that small studies are always bad, but they are more likely to find spurious results, so be wary of reporting on them.

5) Be aware of problems that science is struggling with, like p-hacking and publication bias

Journalists can’t be expected to be experts in every field, and it’s hard to blame them for missing problems in science that scientists themselves often miss. But be aware of the various ways that scientists can chop up data to make it look as though there’s something there when there isn’t — or to quietly hide results that don’t support their hypothesis. Also, if a result is surprising, that might be because it’s not true. Sometimes science is surprising, but most of the time, not very.

6) Don’t report forecasts as single numbers. Give the confidence interval and explain it

A lot of the time, the media will report on forecasts and models of the future. For instance, each year the Office for Budget Responsibility will make an economic forecast for how much the economy will grow. Or in early 2020, statistical modellers made forecasts for how many people the Covid-19 pandemic would kill.

If they said “Without a lockdown, Covid-19 will kill 250,000 people in Britain,” they didn’t mean that it would kill exactly 250,000. Instead, that was a best guess, in the middle of a wide range of uncertainty: they might say, for instance, that they were 95% sure that the true figure would fall between 100,000 and 400,000. That is the “confidence interval”.

Often, though, when the media reports on forecasts and models, they just give the central, best-guess estimate, which makes them sound more precise than they are. When reporting on forecasts and models, make sure you give the confidence interval, not just the central estimate.
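To make the idea concrete, here is a toy sketch of where such an interval comes from. The numbers are invented for illustration, not a real epidemic model; imagine each “run” is the model under slightly different assumptions, and the interval spans the middle 95% of the runs:

```python
# Toy sketch: a forecast reported as an interval, not a single number.
# The figures are invented for illustration, not a real epidemic model.
import random

random.seed(0)
# Pretend each model run gives a death-toll estimate under different assumptions
runs = sorted(random.gauss(250_000, 75_000) for _ in range(10_000))

central = runs[len(runs) // 2]              # median: the "best guess"
lower = runs[int(0.025 * len(runs))]        # 2.5th percentile
upper = runs[int(0.975 * len(runs))]        # 97.5th percentile

print(f"Central estimate: {central:,.0f}")
print(f"95% interval: {lower:,.0f} to {upper:,.0f}")
```

The single headline number is just the middle of that spread; reporting it alone throws the uncertainty away.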

7) Be careful about saying or implying that something causes something else

Lots of studies find correlations between one thing and another — between drinking fizzy drinks and violence, for instance, or between vaping and smoking weed. But the fact that two things are correlated doesn’t mean that one causes the other; there could be something else going on. If the study isn’t a randomised experiment, then it’s much more difficult to show causality. Be wary of saying “video games cause violence” or “YouTube causes extremism” if the study can’t show it.

8) Be wary of cherry-picking and random variation

If you notice that something has gone up by 50% between 2010 and 2018, have a quick look — if you’d started your graph from 2008 or 2006 instead, would the increase still have looked as dramatic? Sometimes numbers jump around a bit, and by picking a point where it happened to be low, you can make random variation look like a shocking story. That’s especially true of relatively rare events, like murder or suicide.
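A quick illustration of how much the choice of start year matters. The counts below are invented, jittering around a flat average, yet the headline figure changes dramatically with the baseline you pick:

```python
# How the choice of start year changes a "percentage increase" headline.
# The counts are invented, jittering around a flat average of about 100.
counts = {2006: 104, 2008: 112, 2010: 80, 2018: 120}

def pct_change(start: int, end: int) -> float:
    """Percentage change between two years' counts."""
    return 100 * (counts[end] - counts[start]) / counts[start]

print(f"2010 -> 2018: +{pct_change(2010, 2018):.0f}%")   # +50%: a shocking rise!
print(f"2008 -> 2018: +{pct_change(2008, 2018):.0f}%")   # +7%: barely a story
print(f"2006 -> 2018: +{pct_change(2006, 2018):.0f}%")   # +15%: somewhere in between
```

Nothing in the underlying data changed; only the cherry-picked baseline did.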

9) Beware of rankings

Has Britain dropped from the world’s fifth-largest economy to the seventh? Is a university ranked the 48th best in the world? What does that mean? Depending on the underlying numbers, it could be a big deal or it could be irrelevant. For example, suppose that Denmark leads the world with 1,000 public defibrillators per million people, and the UK is 17th with 968. That isn’t a huge difference, especially if you compare it with countries that have no public defibrillators. Does being 17th in this case mean that the UK health authorities have a callous disregard for public emergency first-aid installations? Probably not. When giving rankings, always explain the numbers underpinning them and how they’re arrived at.

10) Always give your sources

This is key. Link to, or include in your footnotes, the place you got your numbers from. The original place: the scientific study itself, the Office for National Statistics bulletin, the YouGov poll. If you don’t, you make it much harder for people to check the numbers for themselves.

11) If you get it wrong, admit it

Crucially – if you make a mistake and someone points it out, don’t worry. It happens all the time. Just say thank you, correct it, and move on.

And if you agree with all that, then join our campaign…

How to Read Numbers: A Guide to Stats in the News (and Knowing When to Trust Them) is published on Thursday.