July 7, 2020

Imagine you’re worried about a disease. (Shouldn’t be hard, at the moment.) You’re in charge of some sort of community; I dunno, a school, and there’s a real disease going around, and you want to be able to test for it.

Someone tells you that there’s a chemical found in school wallpapers that makes the disease more likely to spread. But, good news, they have a test which they can run, which detects the levels of the chemical, and then — if levels are high — they can strip the wallpaper for you. Great! You get them to run the test, they find the chemical, they strip the wallpaper; they run the test again, no chemical.

But then you learn that lots of other local schools have had the same treatment, and in the months afterward, some of them get the disease and some of them don’t; the results of the test, and whether or not the wallpaper was subsequently stripped, don’t seem to have any effect on whether or not the disease spreads in the school.

So you say to the people who did the test, hey, what’s going on here? And they say “The disease is real! Are you suggesting the disease isn’t real?” No, you say, I know the disease is real, I’m just saying that the test you’ve run and the treatment you’ve administered don’t seem to have any relation to whether the disease comes or not. “No, the disease is definitely real,” they tell you. Presumably you would be unimpressed.

On which note, there was an extraordinary programme on Channel 4 recently, The School That Tried To End Racism. It took a bunch of 11-year-old kids at a school and made them take an apparently scientific test to detect unconscious racial bias.

The test finds that most of the children — 18 out of 24, in a class that’s 50% non-white! — “felt an unconscious bias towards white people”. The test, they say, “is now widely accepted as an accurate measurement of unconscious racial bias”, something that understandably shocks the children. 

They then make the kids embark on a three-week programme to reduce their unconscious bias. The kids are separated into “affinity groups”, i.e. white children in one room, non-white children in another, to talk about their experiences in their racial groups; a mixed-race child strongly resents being forced to choose which room she goes in. One child ends up in tears.

Then, three weeks later, they take the test again and most of the class is apparently “near to the neutral position: very little or no unconscious bias”.

It’s powerful television, and incredibly uncomfortable to watch. But there are two huge problems with it, and they’re the same problems as in our imaginary school, above: the disease is real, but the test and the treatment are not.

Problem number one: the “implicit association test”, or IAT, on which the programme is based, is essentially useless as a measure of individual prejudice; it’s certainly not “widely accepted as an accurate measurement of unconscious racial bias”.

And problem number two, the “unconscious bias training” used to reduce the impact of this unconscious prejudice has not been shown to work, and in fact it’s just possible that it may even have a negative effect. 

This has been known for some years now, but the two tools, the IAT and the training, have become embedded in the public consciousness. So much so, in fact, that Sir Keir Starmer, the leader of the Labour Party, announced on LBC that he was going to undergo unconscious bias training, after he was criticised for his use of language around the Black Lives Matter protests, and that he would introduce it for all Labour Party staff. “I think everybody should have unconscious bias training,” he said. “I think it is important.”

To drum the analogy home one more time: racism, like our imaginary disease, is real; and it would be astonishing if it weren’t at least partly unconscious, or subconscious, or related to some sort of attitudes not fully available to our conscious minds. “Unconscious bias” is a real thing.

But these two tools for measuring and reducing it do not do the job they are supposed to do. It is pernicious that Channel 4 and Sir Keir are both supporting and publicising them in this context; our social battle against racism will not be helped by using unevidenced tools.

The IAT is a simple and rather clever idea; if you watch the Channel 4 programme you’ll see what I mean, or read this piece I wrote earlier this year. It measures whether people find it easier to associate positive words (“happy”, “wise”, “beautiful”) with white faces and negative words (“pain”, “angry”, “stupid”) with black faces, or vice versa, by measuring their reaction times. Most people (including black people) end up having a “preference for white faces over black faces”.

But there are two problems. First, if you test someone with the IAT twice, you’re likely to get two very different scores; it has low “test-retest reliability”. If you measure my height on Monday and then again on Tuesday, you’ll get almost the same result both times; but if you do the same with the IAT, it’s very possible that it’ll say I’m strongly prejudiced one day and not at all the next. (Which makes the children doing better on their second test less surprising or meaningful, although the odds of a total fluke are much lower when it’s 24 of them being tested.)
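To make the height comparison concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names are made up, the "true" scores are random numbers, and the two noise levels are arbitrary rather than estimates of how noisy the IAT or a tape measure really is. It simply gives each simulated person a stable underlying score, measures them twice with some random error, and checks how well the first measurement agrees with the second.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_retest(n_people, noise_sd, seed=0):
    """Measure the same simulated people twice; return the correlation.

    Each person has a fixed underlying score (illustrative, standard normal).
    Each measurement adds independent random error with spread noise_sd.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
    first = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    second = [t + rng.gauss(0.0, noise_sd) for t in true_scores]
    return pearson(first, second)


# Hypothetical noise levels: small error (height-like) vs large error.
height_like = simulate_retest(10_000, noise_sd=0.1)
noisy_test = simulate_retest(10_000, noise_sd=1.5)

print(f"test-retest correlation, low-noise measure:  {height_like:.2f}")
print(f"test-retest correlation, high-noise measure: {noisy_test:.2f}")
```

The underlying scores are identical in both runs; only the measurement error changes. When the error is large relative to the real differences between people, the second score tells you little about the first, which is the sense in which a test can be unreliable even if the thing it is trying to measure is real.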

Second, whether you score highly or not on the IAT does not correlate with whether or not you behave in prejudiced ways in real life. It is essentially useless as a predictor of individuals’ actual, real-world behaviour. 

The IAT’s own creators say that it should not be used to diagnose levels of prejudice in individuals; they do think it can be useful at population levels, although even then it throws up weird effects, such as all categories of women being more prejudiced against women than all categories of men in the 2016 US presidential election. The idea that the average male Trump voter is less prejudiced against women than the average female Clinton voter is … well, it’s not inherently ridiculous, but it would require a huge redrawing of society’s understanding of what prejudice is and who has it.

I know “internalised misogyny” is a crucial feminist idea, but (to me, at least) it seems deeply weird to think that it would be internalised to that degree. Either way, that doesn’t make it all right to use the IAT on a class of children to tell them that they’re prejudiced against black people.

The idea of training people so that their unconscious bias goes away is on equally shaky ground, if not shakier. A 2017 Equality and Human Rights Commission report found, firstly, that most workplace training uses tests like the IAT to diagnose individual levels of bias, which as we’ve seen is not a good idea. It also found that while bias training can improve IAT scores, there was “limited” evidence for behavioural change, and “potential for back-firing effects”, i.e. it may make the problem worse.

A 2019 meta-analysis found no evidence of backfire, but also “trivial” impact on behavioural change and only a small effect on IAT scores. One systematic review did find some evidence of effect, but only looked at the impact on implicit bias scores, not real-world behaviour. Another says basically that all the research on the topic is complete crap and not to be trusted.

I don’t know which particular flavour of unconscious bias training Sir Keir and the Labour Party will undergo; I gather it will involve “The meaning of unconscious bias; its impact on people we work with; common types; recognising and challenging personal biases; practical tips to uncover and challenge bias”, but I don’t know which particular practitioners or methods they’ll use. Some methods may be better evidenced than others.

Similarly, I can’t find any research on the “affinity groups” model practised in the Channel 4 programme; it was led by a woman called Mariama Richards, who has been doing something similar in US schools since at least 2015. The article about it talks about the work of Claude Steele on “stereotype threat” — the idea that minorities “fail to achieve their potential because they internalise stereotypes” about their group. 

But the intervening years have not been kind to stereotype threat research; many of its most celebrated findings have failed to replicate. That may not be crucial to Richards’ work but I can’t find any research directly related to it, so it’s all I’ve got to rely on. 

For the record: I find the idea of telling children that they are prejudiced against black people, on the basis of an at best highly controversial and at worst flatly unevidenced test, deeply uncomfortable. Dividing them up into racial groups, to at least one child’s obvious distress, likewise. If the benefit were well-evidenced I could understand it, but in the absence of that, and in fact given the small but real possibility that it could exacerbate prejudice, it seems needlessly cruel. Putting the whole thing on telly seems doubly so.

Again: no one sensible is suggesting that humans are not unconsciously biased (least of all me). Of course we’re biased; it’s probably easiest to think of our brains as a collection of biases, loosely strung together. And of course we’re biased towards and against racial and ethnic groups, different sexes, different classes, in societally relevant ways. That seems so obvious, just from observing society or even honest introspection, that it hardly needs saying.

But the IAT, when used like this, is not simply that obvious truth in vaguely scientific language. It is a measure of your score on this specific test, which does not seem to be related (at least at the individual level) to any actual detectable real-world prejudice. And unconscious bias training, much of which seems to be aimed at improving people’s score on the IAT, seems to lack any good evidence for its effectiveness (nor, for that matter, is there much good evidence that it is harmless).

As I’ve written before, lots of major companies use the IAT and implicit bias training. I think that’s a waste of money and employee time, but it’s the companies’ money; as long as it’s not actively making things worse then whatever. It’s sort of the same for the Labour Party, although I’m not sure how its members would feel if they knew their money was going to pay for unevidenced training like this.

But the equation changes when schools are doing it to children, especially with cameras inside their race-segregated groups, and their real names and faces on national television. This should require a higher standard of evidence than “it probably isn’t harmful”. 

A retired clinical psychologist wrote to me about it, suggesting that it probably breaks British Psychological Society guidelines on research on children; certainly the ethical questions over whether it is okay to test these controversial and poorly evidenced tools on children, with no anonymising, are profound.

Channel 4 got back to me with a statement, saying: “The course followed by the school was developed by academics and educators and is based on similar schemes run in US schools. Whilst some academics argue that this test is not perfect, many agree it is a good indicator of implicit bias and it continues to be widely used both in academia and in the real world.” I think it’s worth noting that the “some academics” who “argue that this test is not perfect” include the test’s own creators, who say that it should not be used to assess individual prejudice in the way this programme uses it.

Go back to our original analogy. The disease is real; racism is real, unconscious bias is real. We really do want to reduce prejudice against ethnic groups, because it really does affect people’s life chances. I’m happy to take all that as read, and while I’m really heartened by the fact that on most measures, Britain is slowly getting less prejudiced, we can probably all agree that it’s not happening fast enough.

But the IAT does not tell us anything about whether a class, or a person, is more likely to be prejudiced than anyone else. It does not do what unconscious-bias-training courses are selling it as doing. And the courses themselves are not supported by good evidence, even to the point that we can be sure they’re not doing more harm than good. If you want to protect your class from this disease, then you need a test that works and a treatment that’s effective; otherwise you’re just stripping the wallpaper.