June 8, 2021

During Dominic Cummings’s testimony the other week, one thought kept surfacing: there just isn’t much capacity in the UK Government for thinking about “mad shit that might never happen” but which would be terrible if it did.

Something like a pandemic, for instance. What feels like a hundred years ago, in 2019, the idea that the world would soon be brought to a standstill by a virus would have seemed like a science fiction movie. Sure, a few Cassandras in the infectious disease community warned of it, but for most of us, it was just not a serious consideration.

But now we have had an obvious corrective to that attitude. Pandemics might be unlikely in any given year, but if there’s only a 1% chance per year of something terrible happening, it’ll probably happen in your lifetime.
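That claim is just compound probability. As a sketch (on my own assumption of an 80-year lifetime, with each year’s risk independent):

```python
# Chance a 1%-per-year catastrophe happens at least once in an 80-year
# lifetime, assuming independent years: 1 minus (no catastrophe every year).
p_per_year = 0.01
years = 80
p_at_least_once = 1 - (1 - p_per_year) ** years
print(round(p_at_least_once, 2))  # 0.55: more likely than not
```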

Only preparing for another pandemic, however, would be getting ready to fight the last war. We need to think about what other horrible disasters we might expect in the coming decades. Luckily, last week, a think tank called the Centre for Long-Term Resilience released a report into the most likely extreme risks that humanity faces, and what Britain in particular can do to prepare for them.

Covid-19 has cost millions of lives and tens of trillions of dollars so far — but it could have been a lot worse. The extreme risks that the report is talking about range from those that kill 10% or more of the total human population, to those that kill every last one of us. And it suggests that the two most likely causes of a disaster of this magnitude are bioengineered pathogens, and artificial intelligence.

Even now, that might sound like science fiction, especially the idea of AI. We picture AI going wrong as being like The Terminator: an intelligence achieving consciousness and rebelling against its masters. But that’s not what we ought to worry about, and to illustrate that, I want to tell you about a marvellous little paper published in 2018. It described building AIs through digital evolution.

Digital evolution is exactly what it sounds like: a bunch of machine-learning programs are asked to come up with solutions to some problem or other; then the ones that do best at solving that problem are “bred”, copied repeatedly with small, random variations. Then those new copies try to solve the problem, and the ones that do best are again bred, and so on, for thousands of generations. It’s exactly the same process, of replication, variation and competition, as biological, “real”, evolution.
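A minimal sketch of that loop — not the paper’s actual code, just the generic pattern it describes — might look like this, evolving a number towards a toy target:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def evolve(fitness, pop_size=20, generations=300):
    """Generic digital-evolution loop: score, keep the best, breed mutated copies."""
    population = [0.0] * pop_size
    for _ in range(generations):
        # Score every candidate and keep the top quarter as parents...
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 4]
        # ...then "breed" them: each parent yields copies with small random tweaks.
        population = [p + random.gauss(0, 1.0) for p in parents for _ in range(4)]
    return max(population, key=fitness)

# Toy problem: evolve a number as close to 42 as possible.
best = evolve(lambda x: -abs(x - 42))
```

After a few hundred generations the best candidate typically sits within a fraction of a unit of 42. The paper’s point is that this same blind loop, pointed at a badly specified goal, optimises the literal goal just as relentlessly.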

The paper was basically a series of anecdotes about how that process had gone wrong in surprising ways. In each one, essentially, the AIs had learnt to game the system, often with disastrous results.

For instance, one task involved locomotion, in a simulated 3D world with little 3D avatars. The avatars were told to travel from point A to point B as quickly as possible, and the programmers hoped the system would discover clever ways of travelling: would it evolve snake-like belly-slithering? Hopping like a kangaroo?

But what actually happened was that, “instead of inventing clever limbs or snake-like motions that could push them along (as was hoped for), the creatures evolved to become tall and rigid. When simulated, they would fall over.” Essentially, it created a very tall tower with a weight on the end, standing on point A. When the simulation started, the tower fell over in the direction of point B. It had achieved the task, but … not exactly how its creators hoped.

There were other, rather scarier ones. One system was given some text files and told to create new files as similar as possible to the originals. The various algorithms were plugging away, doing moderately well, when suddenly lots of them started returning perfect scores all at once. It turned out one algorithm had realised that if it deleted the target files, it (and any other algorithm) could hand in a blank sheet for a 100% score: a blank output is a perfect match for a deleted original.
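The failure is easy to reproduce in miniature. Here is a hypothetical grading function of my own invention (not the paper’s code) that compares an output file to a target, and quietly awards full marks once the target has been deleted, because a missing file gets read as empty:

```python
import os
import tempfile

def score(output_path, target_path):
    """Naive grader: fraction of characters that match the target file.
    The bug: a missing target is silently read as empty, so a blank
    output scored against a deleted target earns a perfect 1.0."""
    def read(path):
        if not os.path.exists(path):
            return ""
        with open(path) as f:
            return f.read()
    out, target = read(output_path), read(target_path)
    if not target:
        return 1.0 if not out else 0.0
    matches = sum(a == b for a, b in zip(out, target))
    return matches / max(len(out), len(target))

# An honest attempt scores well but imperfectly...
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "target.txt")
attempt = os.path.join(workdir, "attempt.txt")
with open(target, "w") as f:
    f.write("hello world")
with open(attempt, "w") as f:
    f.write("hello wxrld")
honest = score(attempt, target)

# ...while the "evolved" strategy deletes the target and hands in a blank sheet.
os.remove(target)
with open(attempt, "w") as f:
    f.write("")
cheat = score(attempt, target)
```

Nothing in the grader is malicious; it simply never occurred to its author that a contestant might delete the answer key.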

And one was supposed to play a version of noughts and crosses on an infinitely large board. It realised that if it played a move hundreds of billions of squares away from the centre, its opponents would have to try to represent a board billions of squares across in their memory. They couldn’t do that, and crashed, so the cheating algorithm was declared the winner.
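The crash is just arithmetic: an opponent that naively stores the board as a dense grid needs memory that grows with the square of the board’s width. A rough sketch, with figures of my own choosing (a move 200 billion squares out, one byte per square):

```python
width = 200_000_000_000        # a move ~2e11 squares from the centre
cells = width ** 2             # a dense grid must span the whole board
bytes_needed = cells           # even at just one byte per square
terabytes = bytes_needed / 1e12
print(f"{terabytes:.0e} TB")   # ~4e+10 terabytes: no machine holds that
```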

The point is that when you give an AI a goal, it will try to achieve exactly that goal. Not what you wanted it to do; not what any halfwit would obviously understand you meant it to do. Just what you tell it to do.

When people worry about AI going wrong in disastrous ways, that’s what they’re worrying about. Not about The Terminator; not about AI “going rogue”. They worry about AI doing exactly what you told it to do, and being extremely good at doing what you told it to do, when “what you told it to do” is not what you actually wanted.

These AIs were just toys; when they go wrong, it’s funny. But if a much more powerful AI, with commensurately greater responsibilities — running a power grid, for instance, driving vehicles or commanding military ordnance — went wrong, it would be less comical. With a really, really powerful AI, it could be disastrous. In an illustrative but perhaps unrealistic example, if you gave an enormously powerful AI the goal of “ridding the world population of cancer”, it might come to realise that biochemistry is difficult but that hacking the nuclear codes of ex-Soviet states is quite easy, and a world without humans is a world without humans with cancer. You might think you could just switch it off; but since the AI would know it was less likely to achieve its goal if it were switched off, you might find that it resisted your efforts to do so.

The Centre for Long-Term Resilience report, co-authored by Toby Ord, a senior research fellow at the University of Oxford’s Future of Humanity Institute and the author of The Precipice, a marvellous book about existential risk, argues that right now we have a rare opportunity. After the Second World War, both the world and Britain took advantage of the disaster to build new institutions. In the UK, we created the NHS, and a comprehensive welfare state based around a system of national insurance. Worldwide, we helped build things like the World Bank, the UN, the International Monetary Fund. This was possible, they argue, because the scale of the recent tragedy was fresh in people’s minds, and there was a willingness to take drastic, difficult steps to preserve long-term peace and stability.

Now, they argue, we have the opportunity to build similarly vital new institutions in the wake of the Covid-19 pandemic. There will, Ord et al hope, be enough public will now to get ready for the next disaster, even though governments, and democracies in general, are not brilliant at thinking about long-term risks. The risk, of course, is that we will prepare brilliantly for the thing that has already happened, getting ready to fight the last war. As Ord says: “We need to look beyond the next coronavirus.”

A few years ago I wrote a book which was (partly) about existential risk. The people I spoke to said, just as Ord et al’s report does, that the two things most likely to cause the human species to go extinct are 1) bioengineered pandemics and 2) artificial intelligence. (Climate change and nuclear weapons are very likely to cause awful disasters, but are less likely to drive us literally extinct.)

The world shouldn’t need too much convincing of the possibility of a bioengineered pandemic, not least because there is growing support for the “lab leak” hypothesis. But even after Covid, it may be that AI seems too much like science fiction. People are happy to accept that it will cause near-term problems, like algorithmic bias, but the idea that it could go really, disastrously badly wrong in future is harder to swallow.

But we should have learnt from the pandemic that it is worth preparing for unlikely, but plausible, disasters, especially as AI researchers don’t think it’s that unlikely. Surveys in 2014 and 2016 asked AI researchers when they thought the first “human-level” AI — an AI capable of doing all the intellectual tasks that humans can do — would be built. The median estimate gave a 50% chance by 2050 and a 90% chance by 2075. And what was really interesting was that those same researchers thought there was about a one in six chance that when true AI does arrive, the outcome will be “extremely bad (existential catastrophe)”. That is: everybody dead.

I’m faintly sceptical of the surveys — only about a third of people responded to them, and they may not be representative of AI researchers in general — but even if the results are off by an order of magnitude, that would still mean AI experts think there’s a greater than 1% chance that the world will be devastated by AI in my children’s lifetimes. Certainly I spoke to several AI researchers who thought it was worth worrying about.

Ord et al have some simple prescriptions for how to be ready for the next disaster. On future pandemics, they suggest creating a national body dedicated to monitoring and preparing for biological threats; and they suggest improving “metagenomics”, sequencing technology which takes a sample from a patient and sequences the DNA of every organism in it, before comparing it to a database of known pathogens. If it detects a dangerous or unknown pathogen, it alerts the medical staff. The UK’s world-leading sequencing capability puts us in a strong position here.

And for AI they make some relatively commonsense suggestions, like investing in AI safety R&D and monitoring, bringing more expertise into government, and (and I have to say this does seem very wise) keeping AI systems out of the nuclear “command, control and communication” chain.

More generally, they suggest setting up bodies to consider long-term extreme risks to the UK and the world, such as a Chief Risk Officer and a National Extreme Risk Institute, to think about these things on a longer timescale than the democratic cycle allows. All their ideas add up to less than £50 million a year, while in contrast the pandemic has cost the UK about £300 billion so far: some 6,000 times as much. If there’s even a small chance of reducing the impact of future disasters, it is a worthwhile bet to make.

My own feeling is that they could do more to bring the UK’s almost unrivalled expertise on this into government. Ord and his colleagues at FHI, such as the philosopher and AI researcher Nick Bostrom, form just one of several groups here who focus on the long-term future of humanity. Only the US has access to anything like as much knowledge. Dominic Cummings, for all his many faults, seemed to realise this.

As Ord et al say: there’s a window of opportunity to do this stuff now, while it’s all fresh in our minds.

A noughts-and-crosses-playing AI that makes its opponents crash is, as I say, kind of funny. But at a fundamental level, a much more powerful AI that can command military theatres or civil engineering projects will be similar: it will only care about the things we explicitly tell it to care about. I don’t think it’s silly science fiction to worry about it, or about bioengineered pandemics. We have a chance, over the next year or so, to make those disasters a little bit less likely. We should take it.