In this new era, we’re all becoming data nerds, or hobby-level epidemiologists. We’re all suddenly conversant in things like case fatality rates and R0.
It makes for an attractive amateur pastime because lots of the things we are trying to know — such as how many people are infected, or how deadly the disease is — are hugely uncertain. But there’s another problem, which is that the things we measure are affected by the simple fact that we’re measuring them — and that the right things to measure change with every passing day.
For instance, we’re all wondering about the “exit strategy” — how, now that we’re all in lockdown, we’re going to get out of it. It’s going to involve some combination of testing, contact tracing, perhaps (eventually) immunity passports, and hopefully in the not-too-distant future vaccines and treatments.
But it’s also going to involve — probably — see-sawing back and forth between tight controls and more relaxed ones, trying to eke out the cases over months to avoid overwhelming the NHS. And those controls will have to be imposed and relaxed, to some degree, on the basis of metrics.
In the UK, it might look a bit like this. The 16 March Imperial model proposed a sequence of automatic triggers: when the number of ICU cases in a week reaches a certain number, say 100, lockdown measures (school closures, social distancing, etc) are imposed; when it drops below another number, say 60, they are relaxed. The outcome, hopefully, will be a saw-toothed line on the chart: ICU cases jagging up, coming down, jagging up, coming down, but never breaching the line of health service capacity.
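To make the trigger rule concrete, here is a minimal sketch of that on/off logic, using the illustrative thresholds of 100 and 60 from the paragraph above. The function name, its parameters and the sample weekly figures are all made up for illustration; this is not the Imperial model's code, just the two-threshold rule as described.

```python
from typing import Iterable, List


def lockdown_schedule(
    weekly_icu_cases: Iterable[int],
    on_threshold: int = 100,   # illustrative "say 100" figure from the text
    off_threshold: int = 60,   # illustrative "say 60" figure from the text
    initially_locked: bool = False,
) -> List[bool]:
    """Return, week by week, whether restrictions are in force.

    Hypothetical helper: restrictions switch on when weekly ICU cases
    reach on_threshold, and switch off only when they drop below
    off_threshold. Between the two thresholds the previous state holds.
    """
    locked = initially_locked
    schedule = []
    for cases in weekly_icu_cases:
        if not locked and cases >= on_threshold:
            locked = True    # cases reached the upper trigger: impose
        elif locked and cases < off_threshold:
            locked = False   # cases fell below the lower trigger: relax
        schedule.append(locked)
    return schedule


# Made-up weekly ICU figures, purely for illustration.
example = lockdown_schedule([40, 80, 100, 90, 70, 59, 80, 120])
print(example)
```

The gap between the two thresholds is what produces the saw-tooth: a week with 80 cases means "stay open" if restrictions are off and "stay closed" if they are on. It also shows why the number matters so much to whoever reports it, since a single case either side of a threshold flips the outcome.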
But once you start using ICU beds as a metric, you hit a problem. There’s this thing, “Goodhart’s law”. It’s named after the economist Charles Goodhart, and is usually formulated as “When a measure becomes a target, it ceases to be a good measure.” Goodhart proposed it (in more technical language) when discussing Margaret Thatcher’s economic policies, but it applies everywhere.
Imagine you’re the education secretary of some democracy. Your government has been elected on a platform of education, education, education. You pledged to push lots of money into schools, reduce class sizes and — most of all — improve outcomes.
So you start saying that schools which get more than a certain number of passing grades in their Generic Class Exams will be rewarded, and those which get a lower number will get put into special measures.
Goodhart’s law predicts that once this measure (how many children are passing their exams?) becomes a target (you need 60 children out of every 100 to pass or we’re firing the headteacher), the measure stops being useful. If a school is close to dropping below that magic 60 number, teachers are heavily incentivised to start teaching to the test, or to quietly remove children who are likely to fail.
Essentially, the trouble is that you don’t care about your exam results per se: you care about some complex and ineffable, but real, combination of whether schools make children more productive citizens, whether they make them more well-rounded individuals, whether they make them happier, whether they make them smarter, whether they protect them from abuse at home.
But you, the education secretary, won’t get to see that: all you see is the exam results. And so that’s all you can judge it on. So that’s what people will strive to deliver to you, even if it’s not really connected to the stuff you genuinely care about. “Looking for the perfect summary statistic,” someone once said, “is like trying to write a dust-jacket blurb that replaces the need to read the book.”
Goodhart’s law has obvious implications for governments’ response to Covid-19. I spoke to David Manheim, an Israeli data scientist who’s spent much of his career thinking about both Goodhart’s law and infectious disease, about what problems it might cause and how best to mitigate them.
I expected him to say that the problem would be countries minimising case and death numbers somehow. That is no doubt going on, but it wasn’t his main focus. His chief concern was the exit strategy.
At some point in the next few months, countries are going to start relaxing their lockdowns. The UK will probably have some mechanism not unlike the one I described at the start of the piece; the US will likely have some more localised version.
But it is going to be difficult. Lockdown will be unpopular and expensive and will probably cost many elected officials around the world their re-election. Whatever metrics we use, there will be incentives for those carrying out tests and reporting numbers to get and keep those numbers below the magic line.
In the centrally planned UK, with a nationwide strategy and the NHS in charge of testing and reporting, it might be a manageable problem — although you could imagine that when there are 99 people in ICU beds, the next one is put in a hospital bed with a ventilator and 24-hour monitoring that isn’t, for some reason, called “intensive care”. But in the US, with much more latitude for cities to make their own decisions, things could be very different.
Imagine that US cities with no active cases are allowed out of lockdown, but that when five new cases are discovered, restrictions are back in place. “When there are four cases, will we push really hard to do as many tests as possible?” Manheim says. “The people in charge have a lot of reason to want to minimise how bad things look. There’s a lot of pressure to look at those numbers in ways that are more favourable.”
I don’t want to speculate on the possible impacts, or how much worse it could make any second waves, or anything. But it’s likely to be a real problem, and one we need to deal with.
The exit strategy isn’t the only obvious problem caused by Goodhart’s law: it applies everywhere, for instance with resource allocation. Imagine you’re a local hospital. You have 50 ICU beds. But you know that, in an emergency, you can repurpose another 12 beds to act as ersatz ICUs for a short period, or even improvise further and treat 100 people at a time.
But if the NHS comes around asking each hospital how many beds it has, to allocate spare central resources, it’s obviously in each hospital’s interest to report the lowest number (50) rather than the at-a-pinch stretch figure, so that it can get access to extra funding. And, of course, all the others will be doing the same thing, in a sort of tragedy of the commons.
It would be easy to say that the problem is that we’re treating people like numbers, or reducing complex systems to a few metrics. But the trouble is that it is impossible to run a massive modern economy, or healthcare system, without metrics and simple, algorithmic rules.
In normal times, you want your system to run efficiently, and to make sure that you’re managing all the day-to-day stuff you want managed, and that means that you need to be able to easily keep tabs on what’s going on. That means metrics, and – Goodhart’s law notwithstanding – they can, if well designed, give you better-than-nothing information about your system, and help you manage the things that need managing.
But that leads us to the second problem: that in these rapidly moving times, the metrics you use quickly become useless. For instance, it’s probably very sensible that palliative-care drugs like morphine are carefully controlled. After Harold Shipman, it was obvious that they could easily be abused, so it makes sense to ensure that their use is precisely measured and accounted for. Most of the time, demand for them won’t change dramatically, so you can build a system that allows them to reach the places they’re needed fairly efficiently, while also alerting you if there are any sudden, strange changes that you should be aware of.
But in a crisis, that falls apart. Suddenly the demand for the drugs skyrockets, but the system can’t cope: it’s designed to be inflexible, because flexible systems are harder to keep track of. So you end up with doctors begging the Department of Health to relax the legislation so they can stop patients from dying in needless pain.
Manheim says that one doctor in the US told him that, while he’s been desperately trying to source PPE for his staff, he’s had a dozen emails demanding that he complete his recertification training, which he has to do every three months. That’s really important — we need to make sure doctors are up-to-date with the latest medical research and techniques. But right now, he doesn’t have the required six hours spare. If the normal-times algorithm is applied unthinkingly, then he could lose his medical licence.
This isn’t quite the same problem as Goodhart’s law — it’s not that people are gaming the system; it’s that the system is measuring the wrong things, trying to operate in a world which no longer exists. But it’s the same root problem, that metrics only give you a partial view of what is really going on.
The question is what to do to avoid these problems. One option that would help you minimise the effects of Goodhart’s law, but would come at a cost, is secrecy: hiding the metrics you’re using to trigger lockdowns. If the people doing the testing don’t know what the system is, they can’t game it.
The obvious problem with that is that transparency is good in democracies. It might, though, in countries like Britain which still have a decent amount of trust in their institutions, be possible to say “we’ll release all the information in six months, but for now we’re keeping it secret”. But perhaps it’s too high a price to pay.
Another, less problematic option would be to use more complex multi-factor metrics. Looking only at deaths, or only at case counts, means that people have just that one metric to game; you should always be looking at several.
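As a rough illustration of what "looking at several" might mean in practice, the sketch below checks a handful of indicators against separate limits and reports which ones have been breached. Everything in it is an assumption made for the example: the indicator names, the limits, and the rule that a missing reading counts as a breach (so that simply not reporting a number can't make things look better).

```python
from typing import Dict, List


def breached_indicators(
    readings: Dict[str, float],
    limits: Dict[str, float],
) -> List[str]:
    """Return the names of indicators at or above their limit.

    Hypothetical helper. A reading that is missing is treated as
    breached, on the assumption that an unreported figure should
    never count as good news.
    """
    return sorted(
        name
        for name, limit in limits.items()
        if readings.get(name, float("inf")) >= limit
    )


# Made-up limits and readings, purely for illustration.
limits = {"icu_cases": 100, "deaths": 25, "test_positivity": 0.10}
readings = {"icu_cases": 80, "deaths": 30}  # test_positivity not reported

print(breached_indicators(readings, limits))
```

Here the ICU figure alone looks fine, but the check still flags the deaths figure and the unreported positivity rate. Gaming the result now means moving several numbers at once, which is harder, though not impossible.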
But the main thing is that in a crisis, you need to rely on judgment more. That sounds like it would be a good thing all the time but, as we’ve discussed, for a system to be governable at all it needs metrics, hoops to jump through, visible readouts. When it’s in crisis, though, and you don’t have time to cobble together a good way of keeping track of the things you care about, you might be best off trusting the people making the decisions locally.
Lots of them will get things wrong, but they’ll be more able to be flexible than if you are forcing them to jump through the same hoops as in peacetime. You don’t want to be forcing doctors to do their recertification training when they’re desperately intubating patient after patient on an unbroken 28-hour shift.
I think this also has implications for us, the public. All we see are the big headline figures: deaths, case numbers. They’re all we have. But they don’t necessarily tell us the whole story: case numbers depend on how much testing is being done, and deaths are recorded differently in different countries. And those figures aren’t all we care about: we also care about whether the NHS is getting overwhelmed, and whether other people are dying of preventable diseases because doctors can’t get to them, and excess deaths, and a million other things.
So things like direct comparisons of, say, death numbers between the UK and Ireland or Italy or New Zealand or wherever can tell us some things, but if we focus on them exclusively, then we will encourage a Goodhart’s law situation. The Covid-19 death counts will become the only things that matter, and so the incentives will be to bring them down even at the cost of other things.
All the things we normally rely on for governance — accurate metrics, functioning systems — are currently unavailable; statistics are not going to be as informative as they usually are. We’re going to have to rely even more heavily than usual on expert judgment at every level of the pyramid, and that means that we’re going to have to accept that sometimes people are going to get stuff wrong. We might all be amateur epidemiologists now, but when we make mistakes no one notices; when the real ones do, people die.