Around this time last year the AlphaZero AI system taught itself how to play chess in just four hours. It then thrashed the best chess player in the world (itself a computer program) over the course of a hundred games.
As a demonstration of the power of machine learning, AlphaZero (and similar programs) caused some alarm – see, for instance, the Shadow Foreign Secretary’s take on the issue for UnHerd. If a machine mind can become a chess grandmaster between breakfast and lunch, then what might it achieve by dinner-time, or next week or, gulp, next year?
Well, here we are twelve months on, and human beings are still in charge of the planet. Whatever AlphaZero is up to these days, it isn’t ushering in the Singularity.
In a short post for MIT Technology Review, Karen Hao explains why game-playing AI systems are limited in what they can achieve:
“[Reinforcement learning] is a category of machine learning techniques that uses rewards and penalties to achieve a desired goal. But the benchmark tasks used to measure how RL algorithms are performing—like Atari video games and simulation environments—don’t reflect the complexity of the natural world.
“As a result, the algorithms have grown more sophisticated without confronting real world problems—leaving them too fragile to operate beyond deterministic and narrowly defined environments.”
A game like chess has clear rules and defined objectives. Furthermore, the success, or otherwise, of any sequence of moves can be assessed in terms of interim and final game outcomes. This means an AI chess player can generate vast quantities of labelled data by playing games against itself – in which winning patterns of moves can be recognised and ‘learned’ in a process of trial and error.
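To make the idea of self-generated labelled data concrete, here is a minimal sketch in Python. It is not how AlphaZero works (that system pairs neural networks with tree search); it is a toy under stated assumptions: a single-pile game of Nim in which players take one to three stones and whoever takes the last stone wins, with both sides moving at random. The function names (`self_play_game`, `generate_labelled_data`, `win_rates`) are hypothetical and chosen only for illustration. The point is that every position can be labelled automatically with the final result, with no human annotator involved.

```python
import random
from collections import defaultdict


def self_play_game(start_stones, rng):
    """Play one game of single-pile Nim with random moves for both sides.

    Returns a list of (stones_left, won) pairs, one per position reached,
    where `won` is 1 if the player to move in that position went on to win.
    """
    history = []  # (stones_left, player_to_move) for every position
    stones, player = start_stones, 0
    winner = None
    while stones > 0:
        history.append((stones, player))
        take = rng.randint(1, min(3, stones))  # legal moves: take 1 to 3 stones
        stones -= take
        if stones == 0:
            winner = player  # assumption: taking the last stone wins
        player = 1 - player
    # The final outcome labels every earlier position in the game.
    return [(s, 1 if p == winner else 0) for s, p in history]


def generate_labelled_data(n_games, start_stones=10, seed=0):
    """Collect labelled positions from many self-play games."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_games):
        data.extend(self_play_game(start_stones, rng))
    return data


def win_rates(data):
    """Estimate, per pile size, how often the player to move ended up winning."""
    totals = defaultdict(lambda: [0, 0])  # stones -> [wins, visits]
    for stones, won in data:
        totals[stones][0] += won
        totals[stones][1] += 1
    return {s: wins / visits for s, (wins, visits) in sorted(totals.items())}


data = generate_labelled_data(2000)
rates = win_rates(data)
for stones, rate in rates.items():
    print(f"{stones:2d} stones left: player to move won {rate:.2f} of the time")
```

Because the rules and the winning condition are fixed, the labels come for free. That is exactly the property that most real-world tasks lack.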
Tasks in the real world, however, are not like a game. Even if an objective is clearly defined, the rules for achieving it may be neither obvious nor unchanging. Through trial and error, an AI system may be able to work out what the rules are (or at least a rudimentary version of the rules), but to recognise right and wrong answers (and hence the patterns of rightness and wrongness from which rules can be derived) it usually needs an external source of labelled data.
For these reasons, and others, AI mastery of games does not imply competence, or potential competence, in anything else. So, no danger of AlphaZero and its ilk taking over anytime soon. A more immediate concern, however, is that in the process of computerising our economy and society, we are using technological tools to treat life as if it were a game.
The Latin word for “I play” is ludo. In his book The Black Swan, Nassim Nicholas Taleb describes what he calls the ludic fallacy: “the misuse of games to model real-life situations.” He sees the ludic fallacy at work in models of risk management, such as those that played such a disastrous role in the financial crash of 2008. By relying on game-like assumptions about the probability of certain things happening (assumptions derived from pre-determined, non-extreme parameters), financiers were left clueless as to ‘black swan’ risks: the extremely rare but highly consequential events that have such a defining impact on the real world.
In a different way, the ludic fallacy is also at work in the use of ‘gamification’ by businesses to influence the behaviour of employees, suppliers and customers. In an insightful Guardian long-read, Sarah Mason provides a succinct definition:
“Simply defined, gamification is the use of game elements – point-scoring, levels, competition with others, measurable evidence of accomplishment, ratings and rules of play – in non-game contexts. Games deliver an instantaneous, visceral experience of success and reward, and they are increasingly used in the workplace…”
Drawing on her experience as a driver for a ride-hailing company, Mason describes how tech companies (especially those who coordinate the provision of a service through a digital ‘platform’) use gamification in place of traditional employee management:
“Every Sunday morning, I receive an algorithmically generated ‘challenge’ from Lyft that goes something like this: ‘Complete 34 rides between the hours of 5am on Monday and 5am on Sunday to receive a $63 bonus.’…
“Behavioural scientists and video game designers are well aware that tasks are likely to be completed faster and with greater enthusiasm if one can visualise them as part of a progression towards a larger, pre-established goal.”
Mason adds that “it is not uncommon to hear ride-hailing drivers compare even the mundane act of operating their vehicles to the immersive and addictive experience of playing a video game or a slot machine.”
That’s all the creepier given that the gamified instructions and incentives are generated algorithmically.
Of course, the relationships of the traditional workplace can also have an inhuman, unreal quality. Indeed, it’s hard to think of a more dehumanising piece of business jargon than ‘human resources’. But face-to-face contact, or any means of communication in which there’s a living, breathing person on both sides, at least allows for the possibility that common sense, compassion and intuition might be applied to the situation – especially if no one’s playing games.