May 14, 2020 - 4:14pm

A couple of weeks ago, the US FDA gave the Ebola drug remdesivir an emergency use authorisation for treating Covid-19, after a trial found that patients on it recovered 31% faster than those on a placebo. Sounds good, right? Well: perhaps not.

Imagine you’re doing a study. Say you want to find out whether teaching outdoors in all weathers improves children’s school results. You take 60 children and randomly assign them to two classes: an “outside” class and an “inside” control group. You decide you’ll compare their GCSE results at the end of the year and see who does better.

At the end of the year, they’re pretty similar. You’re disappointed. But you’ve also measured a bunch of other things: percentage of homework assignments handed in; teacher feedback on behaviour; attention span, whatever. You go through all those, and you notice that on one measure — say, pupils’ reported satisfaction with class — the outside class does noticeably better.

The effect is “statistically significant”: that is, you’d expect to see a result that big by chance less than one time in 20 (written as “p<0.05”). So you hand in your report to the Department for Education saying “children enjoy class more if they have lessons outside”. Is that right?

Well: you don’t know. You divided the children up at random, but the two classes could still differ. If one happened to have more of the bright kids, it might get better GCSE results. If one had the goodie-goodies, it might do better on behaviour. And if one happened to have more cheerful types, they might enjoy the class more. None of that would have anything to do with whether you taught them outside or inside.
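A quick sketch of how easily this happens. The numbers below are hypothetical (suppose 20 of the 60 children are “cheerful types”), but the simulation shows how often a perfectly fair random split leaves one class noticeably more cheerful than the other:

```python
import random

# Hypothetical numbers: 60 children, 20 of whom are "cheerful types",
# split at random into an "outside" class and an "inside" class of 30 each.
random.seed(42)
trials = 10_000
lopsided = 0
for _ in range(trials):
    children = [1] * 20 + [0] * 40      # 1 = cheerful, 0 = not
    random.shuffle(children)
    outside = sum(children[:30])         # cheerful children in the outside class
    inside = 20 - outside                # the rest are in the inside class
    if abs(outside - inside) >= 4:       # one class clearly more cheerful
        lopsided += 1
print(f"Noticeably unbalanced splits: {lopsided / trials:.0%}")
```

Under these made-up numbers, roughly two splits in five leave one class with a clear surplus of cheerful children, so a difference in reported satisfaction can easily reflect the luck of the split rather than the teaching.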

What’s more, crucially, the more things you look at, the more likely you are to hit that one-in-20 coincidence and get a “statistically significant” result by chance alone. This is the multiple-comparisons problem, and it’s best illustrated by this XKCD comic. Quietly reporting whichever outcome happened to come up significant, rather than the one you set out to measure, is called “outcome switching”.
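The arithmetic behind that is simple. As a simplified sketch (assuming the outcomes are independent, which real trial outcomes won’t quite be), if each outcome has a one-in-20 chance of a fluke at p<0.05, the chance of at least one fluke grows quickly with the number of outcomes examined:

```python
# Chance of at least one fluke "p < 0.05" result among n independent
# outcomes, each with a 1-in-20 false-positive rate:
#   P(at least one) = 1 - 0.95 ** n
for n in (1, 5, 20, 28):
    print(f"{n:2d} outcomes: {1 - 0.95 ** n:.0%} chance of a fluke hit")
```

With 28 outcomes, the chance of at least one spurious “significant” hit is about 76% under these assumptions: better than three in four.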

According to the Oxford Centre for Evidence-Based Medicine, exactly this sort of outcome switching has gone on with remdesivir. Initially, the study was going to measure mortality and the percentage of patients who needed ventilators. But 13 days before the study’s release, the authors changed the primary outcome to how long it took patients to recover. They examined 27 further outcomes, yet reported only time to recovery and one other, “treatment-related improvements”. On the two outcomes they originally planned to use, including mortality, there was no improvement.

The remdesivir paper is not yet available, so we can’t assess it. Maybe there were very good reasons for switching outcomes. And the reported result is highly significant: p<0.001, not just p<0.05.

But undeclared outcome switching like this looks bad, and — more importantly — there’s no evidence that it saves lives, which is ultimately what we care about: another paper also found no improvement.

This isn’t harmless. The US government is working to make remdesivir widely available; the WHO is apparently now in talks to do the same. If the drug doesn’t work, that diverts resources that could go towards researching or providing drugs that do. Being in a crisis doesn’t mean we can lower the standards of science; we need them more than ever. And one of the key safeguards is pre-registering studies, so that researchers can’t switch the outcomes like this.

Tom Chivers is a science writer. His second book, How to Read Numbers, is out now.