August 10, 2018   3 mins

In a world where we have access to more information than any of us has time to verify, how do we decide what is both true and useful?

The answer is that we make prior assumptions about what (and who) is trustworthy and relevant. Furthermore, we tend to do so collectively – subscribing to the shared assumptions of the groups we want to be part of.

Fair enough. We need those filters, otherwise we’d be overwhelmed. The problem, though, is when the filter becomes so restrictive that it only allows through the information and ideas that confirm existing assumptions. From that point, it ceases to be a defence and becomes a cage – a bubble within which different perspectives have no place and new understanding becomes impossible.

Such bubbles are typically constructed from linguistic components – facts, theories, jargon, slogans, arguments, talking points, memes, cultural references. However, the thickest bubbles are made out of numbers, not words.

While words are open to interpretation, numbers are an expression of objective truth: 2 plus 2 equals 4 and nothing else. We may fear the dark arts of statistical manipulation, but what could be more objective than a direct, unfaked observation?

Unfortunately, there are pitfalls there too. The point is made in an eye-opening piece by Aaron E Carroll for the New York Times. It concerns that most wholesome of corporate activities – the workplace ‘wellness’ programme.

By now, we all know that sedentary lifestyles aren’t good for us – and especially not in combination with mental stress. This, in turn, is bad for productivity, employee retention and the cost of providing health insurance. Hence the rationale for workplace-based schemes to promote healthier lifestyles.

But do they work?

“These… can offer screening for a variety of reversible conditions; access to weight-loss programs or gyms; encouragement and support; and sometimes even chronic disease management. Many of the analyses of these programs have shown positive results.

“Almost all of those analyses are observational, though. They look at programs in a company and compare people who participate with those who don’t.”

Surely, that’s a perfectly fair test – a direct, unmanipulated measurement of the difference that participation makes.

There’s a catch though:

“The most common concern with such studies is that those who participate are different from those who don’t in ways unrelated to the program itself. Maybe those people participating were already healthier. Maybe they were richer, or didn’t drink too much, or were younger. All of these things could bias the study in some way.”

Of course, one can control for factors like age, sex, income and so on. This requires a statistical trick or two, but as long as the effect is to compare like with like (e.g. slightly overweight middle-aged men with other slightly overweight middle-aged men), who could object?

Only, there’s another catch:

“…we can never be sure that there aren’t unmeasured factors, known as confounders, that are changing the results.”

When it comes to getting fit, some people are just more motivated than others – just the sort of people you’d expect to volunteer for a wellness programme. Motivation, being an internal state of mind, is hard to measure objectively and thus to control for – meaning that the participating group is likely to be biased towards those most inclined to make the most of it.

Then there’s bias on the part of the organisers. If it’s your job to make a company wellness programme a great success, who are you going to encourage to take part – the health-conscious individual looking to get back in shape or the unrepentant, donut-scoffing flubba-wubba sat at the next desk?

There is a solution though: the randomised controlled trial – where participation is randomly assigned.

Carroll describes one such trial – of a wellness programme among university employees. This found no significant difference in outcomes between the intervention group and the control group. But here’s the twist:

“…the researchers also took the time to analyze the data as if it were an observational trial. In other words, they took the 3,300 who were offered the wellness program, then analyzed them the way a typical observational trial would, comparing those who participated with those who didn’t.”

When the trial was ‘de-randomised’ – i.e. analysed as a simple observation of participants versus non-participants – the hoped-for differences in outcomes did appear. Furthermore, most of those differences remained even when factors like age and prior healthiness were controlled for.

Clearly, selection bias is a powerful thing.
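To see how strong that effect can be, here is a minimal simulation sketch in Python. Nothing in it comes from the study Carroll describes – the population size, the coefficients and the logistic opt-in rule are invented purely for illustration. In this toy world the programme has no effect on anyone’s health, yet an unmeasured ‘motivation’ trait drives both participation and health, so comparing participants with non-participants produces a healthy-looking gap that survives adjustment for age, while the randomised comparison shows roughly nothing.

```python
import math
import random
import statistics

random.seed(1)
N = 50_000

rows = []
for _ in range(N):
    age = random.randint(25, 60)                      # observed covariate
    motivation = random.gauss(0, 1)                   # unmeasured confounder
    # Health depends on age and motivation; the programme itself does nothing.
    health = 80 - 0.3 * (age - 40) + 5 * motivation + random.gauss(0, 5)
    # Observational world: motivated (and slightly younger) people opt in more often.
    opts_in = random.random() < 1 / (1 + math.exp(-(2 * motivation - 0.02 * (age - 40))))
    # Randomised world: a coin flip decides who is offered the programme.
    assigned = random.random() < 0.5
    rows.append({"age": age, "health": health, "opts_in": opts_in, "assigned": assigned})

def gap(flag):
    """Mean health of the 'treated' group minus mean health of everyone else."""
    treated = [r["health"] for r in rows if r[flag]]
    control = [r["health"] for r in rows if not r[flag]]
    return statistics.mean(treated) - statistics.mean(control)

def gap_within_age_bands(flag):
    """'Control for' age by comparing like with like inside five-year bands."""
    diffs = []
    for lo in range(25, 60, 5):
        band = [r for r in rows if lo <= r["age"] < lo + 5]
        treated = [r["health"] for r in band if r[flag]]
        control = [r["health"] for r in band if not r[flag]]
        if treated and control:
            diffs.append(statistics.mean(treated) - statistics.mean(control))
    return statistics.mean(diffs)

print(f"Observational gap:               {gap('opts_in'):+.2f}")
print(f"Observational gap, age-adjusted: {gap_within_age_bands('opts_in'):+.2f}")
print(f"Randomised gap:                  {gap('assigned'):+.2f}")
# Typical output: both observational gaps are clearly positive; the randomised gap
# sits near zero, even though the programme does nothing for anyone's health.
```

The point of the sketch is simply that adjusting for what you can measure does nothing about what you can’t.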

There are some big implications here for public policy.

The idea that public resources should be allocated on the basis of evidence of effectiveness is a perfectly sound one. Furthermore, with the advent of ‘big data’, the state is acquiring the ability to observe the outcomes of government intervention in greater detail than ever before.

However, the danger is that, with so much data to play with, policy-makers will overlook selection bias, unmeasured confounders and other epistemic pitfalls.

Having left behind the dark age of pure guesswork, policy-making is set to enter a new age of false confidence.


Peter Franklin is Associate Editor of UnHerd. He was previously a policy advisor and speechwriter on environmental and social issues.
