What you need to learn about machine learning

It’s now twenty years since the Deep Blue chess computer beat Garry Kasparov, one of the greatest ever human players of the game.

Chess playing software has had two decades to get better. Until last week, the best computer player in the world (and therefore the best player period) was a programme called Stockfish 8. But that was before it met AlphaZero, which played 100 games against the reigning champion, winning or drawing all of them.

What’s truly remarkable about AlphaZero’s achievement is that it taught itself how to play chess to this world-beating standard in just four hours. To many people, this is deeply disconcerting – if an AI system can pull off such a feat, what can’t it do?

Almost everything, as it happens – but that doesn’t mean it isn’t time to get smart about Machine Learning (which is a better focus for our interest than the fuzzier concept of Artificial Intelligence).

An excellent place to start is a recent Harvard Business Review briefing by Erik Brynjolfsson and Andrew McAfee. The authors pack a lot into a few pages, but a few key points stand out.

First of all, the fact that a Machine Learning (ML) system is good at one thing doesn’t mean that it is good at everything:

“…ML systems are trained to do specific tasks, and typically their knowledge does not generalize. The fallacy that a computer’s narrow understanding implies broader understanding is perhaps the biggest source of confusion, and exaggerated claims, about AI’s progress. We are far from machines that exhibit general intelligence across diverse domains.”

With that out of the way, we can get to what is truly powerful about ML:

“The most important thing to understand about ML is that it represents a fundamentally different approach to creating software: The machine learns from examples, rather than being explicitly programmed for a particular outcome.”

ML overcomes the biggest constraint on our ability to teach computers how to do things that humans can do – which is that, in many cases, we don’t know how we do the things we do. Specifically, we can’t write it down as a list of instructions that can be programmed into a computer.

Of course, many of the things that people can do require things that computers have no capacity for anyway – like consciousness, creativity and imagination. What computers can do extremely well, however, is process lots of data really fast – and that opens the way to a different kind of learning:

“Artificial intelligence and machine learning come in many flavors, but most of the successes in recent years have been in one category: supervised learning systems, in which the machine is given lots of examples of the correct answer to a particular problem…”

If given enough examples of things that match certain pre-set criteria, ML systems can find the commonalities between them – and apply that learning to identify the presence or absence of those patterns in new and unlabelled sets of data. For instance, the rapid progress made in image recognition – e.g. identifying a human face in a photograph – comes from feeding ML systems with labelled photos. You may have unknowingly labelled some of these photos yourself, for instance when uploading your snaps to a social media account or when asked by a website to complete an image recognition task as part of an access procedure. In proving that you’re not a robot, you may have been teaching one.
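For readers who like to see the idea in code, here is a minimal sketch of supervised learning in Python. It is not how any real image recognition system works: the two numeric features, the labels and the function names are all invented for illustration, and the method (averaging the examples for each label, then picking the nearest average) is just one of the simplest ways to learn from labelled data.

```python
from statistics import mean


def train(examples):
    """Learn one average point (a centroid) per label from labelled examples."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    # Average each feature column separately for every label.
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Label a new, unlabelled example with the nearest learned centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical toy data: two made-up measurements per photo, plus a human label.
labelled = [
    ((0.9, 0.8), "face"),
    ((0.8, 0.9), "face"),
    ((0.2, 0.1), "not face"),
    ((0.1, 0.3), "not face"),
]

model = train(labelled)
print(predict(model, (0.7, 0.7)))  # a new example the system has never seen
print(predict(model, (0.2, 0.2)))
```

Nobody wrote a rule saying what a face looks like; the only instructions are "average the examples" and "pick the closest average". Everything the system knows comes from the labelled data, which is why the quality and quantity of that data matter so much.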

The most advanced ML systems are capable of deep learning – meaning that the more examples are fed into them, the more they learn:

“Deep learning algorithms have a significant advantage over earlier generations of ML algorithms: They can make better use of much larger data sets. The old systems would improve as the number of examples in the training data grew, but only up to a point, after which additional data didn’t lead to better predictions…”

You will have noticed just how keen tech companies are to get your data. It isn’t just so that they (and their clients) can sell you stuff. They also want to feed their pet AIs, which get stronger with every bucket of data they eat.

AlphaZero, by the way, was built by DeepMind, which is owned by Google’s parent company, Alphabet.