The danger of convicting with statistics

Courts have a bad history of using probability

What's the probability judges understand Bayes? (Credit: Oli Scarff/Getty)


Tom Chivers

May 28, 2024 7 mins

Sally Clark had two sons. Both died within weeks of birth, a year apart, apparently of sudden infant death syndrome (SIDS), sometimes called cot death. SIDS is — mercifully — rare; in England, at the time, it struck roughly one in 8,500 babies. That statistic led to Clark being prosecuted for double murder in 1998, despite there being little to no forensic evidence for her guilt.

A paediatrician, Roy Meadows, called as an expert witness for the prosecution, told the court that the probability of the two deaths happening by chance was one in 73 million: that is, 8,500 times 8,500.

As it happens, that’s not true. This calculation assumes that the deaths are entirely uncorrelated, but we know that SIDS can run in families and be affected by environmental conditions. If you have one case of SIDS in your household, you are more likely to have a second, although a second case is still incredibly rare; the 73 million figure is orders of magnitude too high. But that wasn’t Meadows’s big mistake.

His big mistake was the following: he assumed that if the probability of the two deaths happening by chance was one in 73 million, then the probability that Sally Clark was innocent was one in 73 million as well.


But this is wrong. Crucially, catastrophically wrong. As wrong as assuming that because only one human in eight billion is the President of the United States, there’s only a one-in-eight-billion chance that the President of the United States is human.

Nonetheless, Meadows’s testimony helped convict Clark in 1999. She spent three years in jail before her conviction was overturned on appeal. Her life was, obviously, ruined. It will not surprise you to learn that she drank herself to death four years later, alone. It’s a haunting story.

The mistake made in Clark’s case is a subcategory of a far wider failure of reasoning, a failure I discuss in my new book, Everything is Predictable: How Bayes’ Remarkable Theorem Explains the World. But in legal circles it comes up again and again — often enough to have its own name: the “prosecutor’s fallacy”.

There was a recent article in the New Yorker about the nurse Lucy Letby, convicted of murdering seven babies in a neonatal ward. The online version is blocked in the UK, because of contempt-of-court laws, despite her appeal against the convictions being denied: she still faces a retrial on one count of attempted murder.

This piece is not about Letby. I do not know the facts of the Letby case and would not be allowed to write about them if I did; whether that is a strength or a weakness of British law, I leave to others to discuss. But I do know that courts, in the UK, US and elsewhere, have a bad history when it comes to the use of statistical evidence. To understand why, we need to go back to the work of an 18th-century nonconformist minister.

The Reverend Thomas Bayes’s eponymous theorem was published in a paper called “An Essay towards solving a Problem in the Doctrine of Chances” in 1763, two years after his death. Previous work in probability theory had answered the question: how likely am I to see some event, given a hypothesis? For instance, if we assume that my dice are fair, and I roll three of them, I can expect to see three sixes one time in 216. That’s called sampling probability.


But most of the time, with statistics, we want to answer the opposite question: how likely is my hypothesis to be true, given some new event? That’s called inferential probability, and it’s a completely different thing.

Say I go to the doctor’s and I get a cancer test. It’s quite an accurate test: if I have cancer, it will correctly say so 99 times out of 100; if I don’t have cancer, it will correctly say so 99 times out of 100. If I get a positive result, then, what’s the chance that I have cancer? Is it 99%?

No. The answer is you don’t know. At least, not with the information I’ve given you.

Imagine that this particular cancer is rare: only one person in 1,000 has it. You test 100,000 people at random. Of that 100,000, about 100 will have the cancer, and your test will pick up 99 of them. Of the remaining 99,900, your test will correctly say that 98,901 are cancer-free.

But that means that it will incorrectly say that 999 people do have cancer when they don’t. So of your 100,000 tests, 1,098 will come back positive, and only 99 of them are true positives. If you are one of them, then there is just a 9% chance you have cancer.

You can’t answer the question “How likely am I to have cancer, given this test?” without first answering the question “How likely did I think I was to have cancer in the first place?” That was Bayes’s big insight. You need what is called a prior probability. If the cancer was less rare, then your positive test would be more worrying: if one person in every 100 had it, then a positive result would mean about a 50% chance you have the disease.
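The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior_given_positive` and its parameter names are invented for illustration, and the numbers are the ones used above (a test that is right 99 times out of 100 either way, and priors of one in 1,000 and one in 100).

```python
from fractions import Fraction

def posterior_given_positive(prior, sensitivity, specificity):
    """Chance of having the disease given a positive test, via Bayes' theorem.

    prior       -- share of the population with the disease
    sensitivity -- chance the test says "positive" when you have it
    specificity -- chance the test says "negative" when you don't
    """
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

accuracy = Fraction(99, 100)  # the article's "99 times out of 100"

rare = posterior_given_positive(Fraction(1, 1000), accuracy, accuracy)
common = posterior_given_positive(Fraction(1, 100), accuracy, accuracy)

print(f"prior of 1 in 1,000: {float(rare):.1%} chance of cancer after a positive test")
print(f"prior of 1 in 100:   {float(common):.1%} chance of cancer after a positive test")
```

With the rarer cancer the result works out to 99 true positives out of 1,098 positive tests; with the commoner one, true and false positives are equally numerous.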

It’s counterintuitive and weird. What do you mean, this 99% accurate test result is almost certainly wrong? But it is mathematically unavoidable.

You can probably see the bearing this has on statistical evidence used in court. Take DNA tests, for instance: you might do a DNA test on a sample from a crime scene. It matches a result on your database. There’s only a one in 3 million chance that someone’s DNA would match the sample by chance. Does that mean there’s only a one-in-3-million chance that your suspect is innocent? As you’ll realise by now, no it does not.

It depends on your prior probability. If your database is a random sample of the British population, then the prior probability that any given person is the culprit is one in 65 million. If you tested the whole population, you’d get about 20 matches just by chance.

But if you are a modern-day Hercule Poirot, and you’re only testing 10 people trapped by a snowstorm in a country mansion, then your prior probability is one in 10, and the chance it’s a false positive is about one in 300,000.
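The two scenarios can be compared with a short Python sketch. It rests on simplifying assumptions that are not spelled out in the text: exactly one culprit is in the pool of people tested, the culprit always matches, and each innocent person matches independently with a one-in-3-million chance. The function name `prob_innocent_given_match` is hypothetical.

```python
from fractions import Fraction

def prob_innocent_given_match(pool_size, match_prob):
    """Chance that a matching person is innocent.

    Assumes (for illustration only) that one culprit is in the pool and
    always matches, that everyone is equally likely to be the culprit
    beforehand, and that each innocent person matches by chance with
    probability match_prob.
    """
    # Prior odds of innocence are (pool_size - 1) to 1; a match is
    # match_prob times as likely for an innocent person as for the culprit.
    innocent_weight = (pool_size - 1) * match_prob
    return innocent_weight / (1 + innocent_weight)

match_prob = Fraction(1, 3_000_000)

whole_country = prob_innocent_given_match(65_000_000, match_prob)
mansion = prob_innocent_given_match(10, match_prob)

print(f"65 million people: {float(whole_country):.1%} chance the match is innocent")
print(f"10 house guests:   about 1 in {round(1 / mansion):,} chance the match is innocent")
```

The point of the sketch is that the same one-in-3-million match figure gives wildly different answers depending on the prior.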

Real cases have turned on these details: a man called Andrew Deen was convicted of rape in 1990 on the basis of DNA evidence which, an expert witness said, had only a one-in-3-million probability of being a chance match. His conviction was overturned (although he was found guilty again in a retrial) because a statistician pointed out that “How likely is it that a person’s DNA would match the sample, if they are innocent?” and “How likely is it that someone is innocent, given that their DNA matches the sample?” are very different questions.

In Sally Clark’s case, the problem was not testing, but clustering: two rare events happening in the same family. But, again, the probability of seeing those events by chance is not the same as the probability that she was innocent.

Hers is far from the only case in which the use of statistics has raised suspicion. In 2022, the Royal Statistical Society wrote a report on statistical evidence in criminal trials, and noted that one of the most common reasons that medical professionals are accused of murder is that “an apparently unusual number of deaths occurs among their patients”.

But these cases are doubly difficult to evaluate, the report noted, because “they involve at least two levels of uncertainty”. As well as the normal uncertainty over whether an individual committed a murder, there is uncertainty over whether any murders occurred at all.

The Dutch paediatric nurse Lucia de Berk was convicted, in two trials in 2003 and 2004, of seven murders and three attempted murders of children under her care. A criminologist told her trial that “the probability of so many deaths occurring while de Berk was on duty was only one in 342 million”.

But even if that were the case — and again, it wasn’t; the RSS estimated that if you took into account all the relevant factors, the chance of seeing a cluster like that could be as high as one in 25 — that’s not the same as the chance that de Berk was innocent. In order to establish that, you would need to take into account the prior probability that someone would be a multiple murderer — a mercifully tiny chance. De Berk’s conviction was overturned in 2010, thanks in part to the work of Bayes-savvy statisticians.

Bayesian reasoning doesn’t only reveal the wrongly convicted — in some instances it could have led to the guilty being detected. During OJ Simpson’s trial for the murder of his wife and her friend, for instance, the prosecution showed that Simpson had been physically abusive. The defence, though, argued that “an infinitesimal percentage — certainly fewer than 1 in 2,500 — of men who slap or beat their wives go on to murder them” in a given year, so it wasn’t relevant to the case.

But this is simply the opposite mistake to the prosecutor’s fallacy. The probability that a man who beats his wife will go on to murder her in a given year might “only” be one in 2,500. But that’s not what we’re asking. What we want to know is, if a man beats his wife, and given that the wife is then murdered, what is the chance that he did it?

The scholar of risk Gerd Gigerenzer had a go at answering that. The base rate for murders among American women is about five in 100,000. Assuming the one-in-2,500 probability is correct, then of 100,000 women with abusive husbands, about 99,955 will not be murdered. But of the remaining 45, 40 will be murdered by their husbands. The correct probability that the husband did it should be nearly 90%. Bayesian thinking might have helped convict Simpson.
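Gigerenzer’s counting argument is easy to reproduce. The sketch below uses only the figures quoted above; treating the five-in-100,000 base rate as murders by someone other than the husband is this sketch’s reading of the example, and the variable names are invented for illustration.

```python
# Figures as quoted in the article, for one year.
WOMEN = 100_000                  # women with abusive husbands

# Defence's figure: 1 in 2,500 abusive husbands murders his wife.
killed_by_husband = WOMEN // 2_500           # 40

# Base rate of 5 in 100,000, treated here (an assumption of this
# sketch) as murders committed by someone other than the husband.
killed_by_other = WOMEN * 5 // 100_000       # 5

murdered = killed_by_husband + killed_by_other
not_murdered = WOMEN - murdered

# The question the court cares about: given that an abused woman
# HAS been murdered, how often was the husband the killer?
p_husband = killed_by_husband / murdered

print(f"{not_murdered:,} of {WOMEN:,} abused women are not murdered")
print(f"{killed_by_husband} of the {murdered} who are murdered were killed by their husbands")
print(f"P(husband did it | abused and murdered) = {p_husband:.1%}")
```

The defence’s one-in-2,500 figure answers a question about all abused women; the calculation here conditions on the murder having happened, which is the situation a court actually faces.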

The statistician Professor Sir David Spiegelhalter does not use full Bayesian reasoning but a slimmed-down version. Even so, he has argued that the insight that we should update our existing probability estimates with new information (cumulative monitoring, rather than one-off testing) could have spotted both the catastrophe at the Bristol Royal Infirmary and the murders of Britain’s worst serial killer, Harold Shipman, earlier, and saved many lives.

The Lucy Letby case does not turn solely on statistical evidence, and I make no arguments about whether she is guilty or innocent here. But people, including juries, prosecutors and judges, have misunderstood probability in the past; Sally Clark and Lucia de Berk had their lives ruined by that misunderstanding. Thinking like a Bayesian might have helped prevent that.

Tom Chivers is a science writer. His second book, How to Read Numbers, is out now.



73 Comments


Ian_S

3 months ago

Sadly, Bayesian probability was never in my school curriculum. We did endless calculus, which I was good at, but have never once used since. I can think of many times Bayesian probability might have been useful in my career and research, had I known about it. But it seems to only have become “mainstream” in the past few years. And now, we see, its importance in mathematical literacy even extends to matters of justice.

Dennis Roberts

3 months ago

Reply to Ian_S

I don’t think it’s taught in schools now either. It should be.

Steven Carr

3 months ago

Reply to Dennis Roberts

What is taught is R.A.Fisher’s hypothesis testing, which is very different from Bayes theorem.

Carlos Danger

3 months ago

Reply to Steven Carr

Ronald Fisher was a foundational figure, the 20th century father of modern statistics. Thomas Bayes was a little-known clergyman and gentleman, who died in the 18th century and left behind some notes that had to be corrected and publicized by others. While Thomas Bayes did some remarkable work, it was not that remarkable. His contribution was niche, nowhere near the contribution that Ronald Fisher made.
That said, Ronald Fisher did have his faults. As noted, he kicked back against Thomas Bayes’s work in a churlish way that turned out to be wrong. In similar fashion and for similar reasons, Ronald Fisher also quarreled with the work of Richard Doll and A.B. Hill on whether cigarette smoking caused lung cancer. Causal inference and modern epidemiology have their origin in the work of Doll and Hill, despite the shade thrown their way by Ronald Fisher as he smoked his pipe to his death from cancer.

Steven Carr

3 months ago

Reply to Carlos Danger

You are correct.
Bayes Theorem is mathematically trivial. Just play with a Venn diagram and you will be able to work it out.
Fisher’s maths was stunning.
Nevertheless, Bayes Theorem is true and directly relevant to most science papers.
They ask the question, – I know the probability of the data given the hypothesis, but what I really want to know is the probability of the hypothesis given the data.
This is something that Fisher’s significance tests were not even designed to do.
Science research papers are often using the wrong tool – as wrong as using a thermometer to measure the rain.

Jonathan Andrews

3 months ago

Reply to Ian_S

Agreed, Calculus is regarded as the ultimate aim of A level mathematics and, while students study some Statistics now, it’s not enough.

Iain MacKay

3 months ago

Reply to Jonathan Andrews

Very sad that calculus should be now the ultimate aim of A level mathematics when 50 years ago it was delivered by O level (precursor of GCSE).
This permitted the teaching of A level Physics at beyond a basic level as calculus is a prerequisite for the understanding of Newton’s laws, simple harmonic motion et al.
We may not use calculus in detail in everyday life but the conceptualisation of area-under-a-curve, exponential growth and other topics mean you can make evaluations and challenge numbers that are thrown at you all the time. We all need it.
Fortunately teenage brains have plenty of room for calculus, probability and much history and geography besides if we only supported them.

Lancashire Lad

3 months ago

Reply to Ian_S

Might I venture the opinion that if it were part of the curriculum, Bayesian probability would be so badly taught that it might prove to be worse than useless, potentially dangerous even?

The reason? Maths teachers themselves would understand it very poorly, and transmit it only to sow greater confusion. Calculus is taught because it’s “safe”.

I’ve no idea what the odds are of a Maths teacher having a sufficient grasp of both the theorem and the real-world complexities to be able to transmit it proficiently to students whose experience of the world is necessarily limited: perhaps someone could apply Bayes theorem to work it out?

Steven Carr

3 months ago

Reply to Lancashire Lad

Bayes Theorem is briefly taught in some A-Level courses.

Dennis Roberts

3 months ago

Reply to Steven Carr

A basic introduction to stats should be at GCSE. There’s no need to get deep into the maths, which would be too soon; just examples like those this article provides would be eye-opening yet understandable for 15- and 16-year-olds.

Lancashire Lad

3 months ago

Reply to Steven Carr

But earlier, you posted a comment suggesting it wasn’t !!

Steven Carr

3 months ago

Reply to Lancashire Lad

It isn’t taught very often. It is sometimes given a brief mention.
My apologies, I should have been more precise.

Lancashire Lad

3 months ago

Reply to Steven Carr

Thanks for clarifying.

John Wilkes

3 months ago

Reply to Lancashire Lad

Given that a high proportion of teachers taking maths classes have no maths qualifications at all, merely a teaching qualification (PGCE probably), it is unlikely that they understand much at all.

Norfolk Sceptic

2 months ago

Reply to Lancashire Lad

Calculus is needed for Physics, Chemistry, Engineering and similar degree courses.

And its understanding is needed for many STEM related jobs.

The reason for the ridiculous Net Zero policies is that Arts, Humanities and Social Science graduates, in the UK Dept of Energy, don’t understand the application of Mathematics, including Calculus.

Jim Veenbaas

3 months ago

Super interesting essay. Thanks for this.

Right-Wing Hippie

3 months ago

You can use statistics to prove any damn thing.

Jonathan Andrews

3 months ago

Reply to Right-Wing Hippie

You can play around but you eventually get found out.

Carlos Danger

3 months ago

Reply to Right-Wing Hippie

Lies, damned lies, and statistics.

Stephen Follows

3 months ago

Reply to Right-Wing Hippie

Only 98.6% of the time, though.

Fafa Fafa

3 months ago

The non-education of lawyers in scientific issues is probably a requirement for them to be able to make the most outlandish cause-effect claims in court without blushing.

Reminded me of a story that claims to be an actual transcript; even if it isn’t, it could be:

Lawyer: “Doctor, before you performed the autopsy, did you check for a pulse?”
Witness: “No.”
Lawyer: “Did you check for blood pressure?”
Witness: “No.”
Lawyer: “Did you check for breathing?”
Witness: “No.”
Lawyer: “So, then it is possible that the patient was alive when you began the autopsy?”
Witness: “No.”
Lawyer: “How can you be so sure, Doctor?”
Witness: “Because his brain was sitting on my desk in a jar.”
Lawyer: “But could the patient have still been alive nevertheless?”
Witness: “Yes, it is possible that he could have been alive and practicing law somewhere.”

Jeremy Bray

3 months ago

An important essay. The use of statistical arguments in Court is fraught with risk of injustice given the general ignorance of Bayesian probabilities. Increasingly cases turn on expert testimony that is often tainted because the experts don’t understand Bayesian probabilities and consequently the judge and jury are bamboozled.

Carlos Danger

3 months ago

Let me start with some praise. It sounds like Tom Chivers has written an interesting book, and I’m going to get a copy and read it. (Though I note the book says “Tom Chivers is an author and the award-winning science writer for Semafor. His writing has appeared in The Times (London), The Guardian, New Scientist, Wired, CNN, and more.” UnHerd, I guess, gets the ignominy of being in the “and more”.)
It’s good to see a book on how to better use statistics and probability. Those fields are not intuitive. As Daniel Kahneman taught us, we all have biases in our thinking that niggle us even when we know we are biased. No matter that I know differently, I’ll always feel that tails is more likely after several flips of a coin yield head after head. A reminder to be more analytical in my analysis always helps, and Tom Chivers’ book gives me that reminder.
That said, let me end with some criticism. Bayes’ theorem is important, but it’s not, “without exaggeration, perhaps the most important single equation in history”. Not even close. It’s a niche theorem, and rarely applies, though when it does, it’s elegantly helpful in countering our intuition.
Tom Chivers oversells the eighteenth-century reverend Thomas Bayes and his theorem by talking about Bayesian statistics and Bayesian reasoning. Those are ways of thinking that reflect Bayes’ theorem but lack its rigor and are more pseudo than science. Bayes had nothing to do with them. They are modern, casual extrapolations of his tight and tidy theorizing.
Bayes’ theorem requires that you know three different probabilities to calculate a fourth. Bayesian reasoning lets you guess your “priors”, adjust them based on your guess of what new information implies, and predict another probability based on that. It’s guessing, not analysis.
Tom Chivers says “Everything Is Predictable”, but don’t buy it. I mean, buy the book, but don’t buy the premise. Using “lies, damned lies and statistics” as a bolster for weak arguments relies, in its modern form, on Bayesian statistics. It’s the same old oversell, just dressed in new clothes.

Steven Carr

3 months ago

Reply to Carlos Danger

‘It’s guessing, not analysis’.
No, it is iteration. Guesses for probabilities get repeatedly improved until they are extremely good. This is how AlphaZero became so good at chess.
It started with literally a uniform probability distribution for chess moves, making moves at random, and used the results to update the priors. Within hours it was unbeatable by humans.
A lot of research papers use Fisher’s 5% significance test, and ignore Bayes.
This has resulted in a replication crisis in psychology and medicine, with disastrous results.
For starters, if you have a 5% significance level, that guarantees that 1 in 20 research papers have wrong conclusions.

Carlos Danger

3 months ago

Reply to Steven Carr

Good point about AlphaZero using some probability analysis in their algorithm. But though that analysis has some vague connection to Thomas Bayes’ theorem, giving him credit for what people are doing when training hugely sophisticated machine learning algorithms today is like calling Charles Babbage the inventor of the computer and Ada Lovelace the inventor of computer programming.
Also good point about Ronald Fisher’s significance test, though I think the p-hacking abuse has tainted a fairly reasonable theoretical approach.
My gripe is with science writers like Tom Chivers choosing a topic like Bayesian statistics or reasoning or inference or whatever and making it sound like it’s the greatest thing ever. It’s tiresome, but I guess they need to do it to sell books. I still marvel that Rebecca Skloot made so much out of The Immortal Life of Henrietta Lacks.

Michael Lipkin

3 months ago

Reply to Carlos Danger

Sure, it’s guessing – but the point is that the guesses are made explicit and appear in one place (the prior) rather than hidden within other processes.

Carlos Danger

3 months ago

Reply to Michael Lipkin

It’s not just the prior that you must guess at.

William Edward Henry Appleby

2 months ago

Reply to Carlos Danger

Users of Bayes’ Rule can also naively simplify things further by choosing conjugate priors to make the algebra simpler, but in fact the real world is never that clean and knowing the posterior distribution is often intractable in practice.
IMHO, there was no need to appeal to Bayes’ Rule in the Sally Clark case: a simple use of conditional probability should have convinced the jury that P(second cot death) < P(second cot death | first cot death) by appealing to biological and/or genetic arguments; that’s the defence’s mistake.

William Edward Henry Appleby

3 months ago

Reply to Michael Lipkin

And how do you get the prior? No one explains that bit properly.

Prashant Kotak

3 months ago

Thank you for this. Also worth noting, is that the outflows of Bayes’ Theorem, specifically Bayesian inference, are implicitly tied into many machine learning techniques, including deep neural networks (deep learning). Also worth noting, is that Reverend Bayes was minister in Tonbridge Wells, an area of Kent that has been true blue for two hundred years, but is about to ditch their Tory MP for the first time ever. I wonder if even Reverend Bayes would have voted Tory in the forthcoming election, or would he have been ‘disgusted of Tonbridge Wells’ this time.

Andrew D

3 months ago

Reply to Prashant Kotak

Tunbridge Wells has only been a constituency since 1974

Prashant Kotak

3 months ago

Reply to Andrew D

Sure, as a constituency boundary. But the area…

Dougie Undersub

3 months ago

Reply to Prashant Kotak

It’s Tunbridge Wells. Confusingly, Tonbridge is a different, albeit fairly nearby, place.

Prashant Kotak

3 months ago

Reply to Dougie Undersub

Yeah, spelling is not my strong suit, and I stubbornly refuse to turn on autocorrect in the vain attempt to improve it, so my spelling can be rather, um, variable. Although Reverend Bayes is now claimed by Tunbridge, I seem to recall reading he was a minister in Tonbridge – perhaps the latter is now instead the voting constituency of the former, or there has been a name change since his time.

Rob N

3 months ago

Personally I don’t think statistics have any place in a court. The chance of my number selection winning the Lottery is, say, 1 in 45 million, and so very unlikely. So, the argument goes, that is so unlikely that, if I win, it must have been due to cheating. Exceptionally unlikely events happen every day and their rarity says nothing about their legality.

Jon Morrow

3 months ago

Reply to Rob N

You have just as much chance winning the jackpot if you don’t buy a ticket.

Warren Trees

3 months ago

Reply to Rob N

Good point. Any survivor of a lightning strike will agree with you.

Steven Carr

3 months ago

So lawyers go p-hacking to get something, anything, that sounds good for their case, ignoring the real statistics?
Who knew?

Dennis Roberts

3 months ago

Reply to Steven Carr

And the lawyers on the other side should be able to see the flaws and present their client’s side.

David Morley

3 months ago

I’m no statistician, but isn’t there an easier way of looking at this? The chances of a specific identified person going on to have two children die of SIDS are very low. But the chances of someone, somewhere having this happen are far higher.

Until she had this happen to her, Sally Clark was just someone, somewhere. It is only the event itself which makes her a specific person.

Likewise the chances of you, the reader, being struck twice by lightning are vanishingly small. But in a world of billions the chances of this happening to someone are far higher. Perhaps it is even likely. And this does not mean there is anything special about the person struck.

Dennis Roberts

3 months ago

Reply to David Morley

Which is part of what the court missed. Sally Clark’s lawyers should’ve been able to make that point but I believe Roy Meadows could not be questioned.

Jonathan Nash

3 months ago

Reply to Dennis Roberts

I would be very surprised if Meadows could not be questioned, but even if that were true it would not stop the defence team pointing out the obvious fallacies in his reasoning. I believe her appeal was allowed because the defence failed to do this, making her conviction unsafe.

Dennis Roberts

3 months ago

Reply to Jonathan Nash

My understanding was that expert witnesses such as Meadows could not be cross examined – their testimony had to be taken as Gospel. I don’t know if this is still the case.

But yes, the defence should have brought in their own expert witness somehow, so perhaps they didn’t understand.

Seb Dakin

2 months ago

Reply to David Morley

Quite. My maths is dismal but my first thought was that a one in seventy-three million chance of it happening meant that given the millions of situations where one child dies of SIDS, sooner or later there’ll be some poor soul who draws the short straw twice. You don’t need maths as such, just simple reasoning. I’m surprised her original lawyer didn’t spot it.

UnHerd Reader

2 months ago

Reply to Seb Dakin

It is bizarre. If your car is stolen from outside your house, you don’t say Wow, only 1 in 100,000 cars is stolen every year, so clearly the odds are I’ll never have another one stolen again. You say 5h!t, I obviously live in an area where cars get stolen, so losing one means it’s more likely, not less, that I’ll lose another.
Even if you don’t work that out for yourself right away, you will cotton on soon. Right about when you see what your insurance company charges you to insure your next car.

John Riordan

3 months ago

Even cropped up during the pandemic, with the silly Benjamin Butterworth on GBNews defending the ludicrously-badly crafted testing policy on the basis that any false positives would be cancelled out by the false negatives, thus assuring us all that the stats on infection rates were trustworthy.

Stephen Follows

3 months ago

‘The Lucy Letby case does not turn solely on statistical evidence’

Doesn’t it? Where’s the forensic evidence? Where’s the witness evidence? Where’s the corroborating evidence of bad character?

JR Stoker

3 months ago

Reply to Stephen Follows

There is some limited evidence available, and that does tip any statistical probabilities. In the Clark case there was no evidence of ill-doing whatsoever: the judge should have thrown the case out immediately and not relied on statistical mumbo jumbo; there was simply no proof of crime resulting in an appalling miscarriage of justice

Lancashire Lad

3 months ago

Reply to Stephen Follows

Her diary (or what was found to have been written by herself)?

Fran Martinez

3 months ago

And you have a bad history of pushing for unnecessary treatments to healthy people!

Gordon Black

3 months ago

A politician, a judge and a statistician went hiking in Scotland. They spotted a resting black sheep.
Politician – “Look, the sheep in Scotland are black!”
Judge – “Hang on … at least one sheep in Scotland is black.”
Statistician – “Hang on … black on one of its sides for sure.”

Arouet

3 months ago

Great article. I’m continually appalled by the lack of statistical understanding in public debates, where an association between two variables is used as evidence of a strong causal relationship. Some examples:
Ignoring variance. Everyone has heard that SAT scores are correlated with family income (I live in the US). I’ve even read articles that state “Tell me someone’s family income, and I’ll tell you their SAT score”. This completely ignores the high variability, where the variation between individuals within the same income group (and even the same family) swamps the difference between groups. I know students brought up by a single mother with nearly perfect SATs, and others from wealthy families with scores much lower. But saying “Tell me someone’s family income and I’ll tell you their SAT score plus or minus 250 points” doesn’t sound so impressive.
Ignoring confounding variables. The association between two variables tells us nothing if other variables of interest are ignored. This particularly happens in discussion on how income relates to race and sex. Interviews with activists on NPR talk, for example, about how black men with PhDs earn less than white men. This completely ignores the effect of subject studied, which also has a major impact on income. It’s entirely possible for there to be no racial income difference within every subject, but a large difference overall, due to more black men studying subjects with lower earnings potential and fewer STEM-based subjects. Similarly for the male-female earnings gap, which shrinks from about 16% overall to about 1% once other factors such as profession and hours worked are taken into account.
Placing too much emphasis on statistical significance. Statistical significance tells us how likely it is that some difference arises by chance. Given a large enough sample size though, small differences can still be highly “significant”, as in highly unlikely by chance. What’s much more interesting is the effect size – how big the difference is – and how much of the variation is explained by it. Articles stating that something is found to be “highly significant” would be much less impressive if they also stated that it only explains 5% of the overall difference. We’re always led to assume that it explains everything, in accord with whatever agenda the writer is pushing.
All in all, this is always done to support simplistic narratives, such that all individuals within one group can be classified in one way and all individuals within another group as another. Large overlaps and other contributing factors are ignored.

Steven Carr

3 months ago

Reply to Arouet

Quite correct.
Also factor in Simpson’s Paradox.
And also the fact that if A is correlated with B and B is correlated with C, then it is not always true that A is correlated with C. This leads to endless Daily Mail articles about how X both cures and causes cancer.
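A small sketch of why correlation is not transitive. The set-up is an assumption picked for simplicity: A and C are two independent fair coin flips (0 or 1), and B is their sum. All four outcomes are enumerated exactly, so there is no sampling noise.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# The four equally likely outcomes of two independent fair coin flips.
outcomes = list(product([0, 1], repeat=2))
a = [first for first, _ in outcomes]             # A: first flip
c = [second for _, second in outcomes]           # C: second flip
b = [first + second for first, second in outcomes]  # B: sum of both flips

print(f"corr(A, B) = {pearson(a, b):.3f}")
print(f"corr(B, C) = {pearson(b, c):.3f}")
print(f"corr(A, C) = {pearson(a, c):.3f}")
```

A and C each correlate with B at about 0.707, while A and C have a correlation of exactly zero with each other.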

Jürg Gassmann

3 months ago

Bayes was also sadly absent during “Covid”. All manner of “probabilities” – positive PCR test, effectiveness of the shots – look completely different once this tool is applied (leaving aside the dismally bad quality of the base data to begin with).

[email protected] [email protected]

3 months ago

One of the problems in the case of Sally Clark was the assumption of a default.
A particular event must have been caused by A or B. A is extremely unlikely so by default, it must have been B. But what if B is also extremely unlikely?
As stated, one in 8,500 children die from SIDS. I do not know the figure but let’s say for the sake of argument that one in 100,000 babies are murdered by their mothers. The chance of two babies being murdered by their mother is the probability squared, so one in 10,000,000,000. Even if you take into account clustering, it is still less likely than SIDS.
“It is too unlikely to be SIDS on both occasions, so it must be murder” becomes “it is too unlikely to be murder, so it must be SIDS”.

Alex Lekas

3 months ago

If you want to confuse lawyers and reporters, use numbers. The misuse of statistics was evident during the pandemic when the fear-mongering became the basis for masking, lockdowns, and the other things that did not work.

Max Beran

3 months ago

Climate “science” is rife with the same issue of not understanding you can’t simply reverse the conditionality and keep the same probability but need to bring in other countering probabilities. To be more specific, the IPCC was set up to be confirmatory of the hypothesis that mankind is responsible for climate change. This translates in terms of the Bayes formula to asking how likely are specific items of evidence given the truth of the hypothesis – an exact analogue of the prosecutor fallacy. So, for example, “given” man is warming the planet, then the probability that sea level is observed to be rising is high, ditto for all the other poster children of climate change. The probabilities are then switched to make it appear that they apply to the originating hypothesis, again just like the prosecutor fallacy.
But that’s not the way kosher science works – it poses the issue the other way round – “given the observations what is the probability that some stated hypothesis is true”. Bayes formula provides the means for deriving the one from the other and so reverse the conditionality, but requires missing elements to do so, in particular the probabilities derived from all those observations that don’t match the hypothesis (like glaciers melting long before CO2 started its rise), and those probabilities obtained from hypotheses that the observations are also compatible with (like natural variation).
So we are left with a topsy-turvy science in which the reality of man-made change becomes the null hypothesis to be rejected or accepted. As a consequence we are treated to that unfamiliar and jarring formulation in the IPCC WG1 chapter on extreme events where they report the confidence in the observation rather than reporting the confidence in the explanation of the observation (which of course has been posited as unassailably settled and true). So they report, for example, that there is low confidence in an increase in flood magnitudes and frequency rather than pointing out that the absence of this anticipated observation makes the hypothesis less believable and would in other areas of science be a reason to reject it. At the same time the slack wording leaves the impression and holds out the expectation that it’s just a matter of time and gathering more data then we’ll know for sure that man-made climate change makes flooding more severe.
I strongly suspect those involved realise deep down that they are doing something fishy with the numbers, and that this is one of the reasons it is termed “the” science: the definite article is an equally unfamiliar and jarring formulation, distinguishing it from proper science, where data are allowed to reject or accept a hypothesis.

Norman Powers

2 months ago

Reply to Max Beran

For sure they realize. Climatology doesn’t only invert causal inference, it inverts temporal causality! No way to accidentally do that without realizing. They regularly revise old “observations” from weather stations, as in, they literally rewrite the past on a continuous basis. A temperature of X degrees observed at time T at station S will silently become a temperature of X +/- 0.1 degrees, over and over. Only sceptics who happen to have downloaded old versions of the data notice these changes, which happen Big Brother style in the middle of the night with old data going down the memory hole.
“In reality the past is fixed but the future can be changed. In climatology the future is fixed but the past can be changed”.
These rewrites invariably cool the past and warm the present, creating the global warming with which we are so familiar. They also routinely invalidate the data on which thousands of research papers were built. In a real science that would be a massive embarrassment and problem, in academia nobody cares because it’s all fraud and always has been.

John Riordan

3 months ago

Another one of these statistical quirks is my favourite: the Monty Hall paradox. On the Monty Hall show it was which of three doors has the car behind it (the other two hide goats), but of course it’s the old three-cup shuffle trick where you try to guess which cup has the marble in it.

The interesting part is this: after the shuffle and after you’ve made your guess, the shuffler then removes one of the other two cups. Do you change your choice, or hold? The answer – which almost nobody is able to work out from first principles – is that you should switch your choice to the other cup. Most people assume that the original 33.33% chance must simply move to a 50% chance on either remaining cup, but this is not the case: the original guess at 33.33% remains in place, but the rest of the probability, 66.67%, now resides in the remaining cup. So, you should switch.

The reason this doesn’t contradict the laws of probability is that because the shuffler always knows where the marble is and will never remove the cup containing it, he is providing to you a new piece of information at the point where he removes a cup he knows to be empty, and that piece of information alters the balance of probability relating to the original 1 in 3 chance of success, but doesn’t stop it being a 1 in 3 bet.
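For anyone who would rather check than take it on trust, here is a minimal sketch that enumerates every case exactly instead of simulating. It assumes the marble's position and the first guess are each uniformly random, and that the shuffler always removes an empty cup other than the one you picked, choosing at random when two such cups are available.

```python
from fractions import Fraction
from itertools import product

CUPS = (0, 1, 2)


def win_probabilities():
    """Return (P(win if you stay), P(win if you switch)) by exact enumeration."""
    stay = Fraction(0)
    switch = Fraction(0)
    # Marble position and first guess: 9 equally likely combinations.
    for marble, guess in product(CUPS, repeat=2):
        # The shuffler only ever removes an empty cup that you did not pick.
        removable = [cup for cup in CUPS if cup != guess and cup != marble]
        for removed in removable:
            # Split the case evenly when the shuffler has two cups to choose from.
            weight = Fraction(1, 9) / len(removable)
            remaining = next(cup for cup in CUPS if cup not in (guess, removed))
            if guess == marble:
                stay += weight
            if remaining == marble:
                switch += weight
    return stay, switch


stay, switch = win_probabilities()
print(f"stay: {stay}, switch: {switch}")  # stay: 1/3, switch: 2/3
```

The key design choice is the `removable` list: because the shuffler never removes the marble, switching loses only in the cases where the first guess was already right, and that is one time in three.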

Carlos Danger

3 months ago

Reply to John Riordan

That’s a great description of the Monty Hall problem that “smartest person in the world” (IQ of 228) Marilyn vos Savant made famous. Like a lot of probability theory the answer that you should switch doors to get better odds is quite counterintuitive, but if you think about it once you know the answer it’s not hard to figure out why the odds turn in favor of switching. Bayes’ theorem seems like it ought to help, but it doesn’t really (at least for me).

John Wilkes

3 months ago

Meadow, who was later struck off, said in relation to another case something like ‘one death is a tragedy, two is suspicious, three is murder’.
Manipulation of statistics can be seen too throughout the world of politics practically every day. We need good statistical knowledge to protect ourselves from state propaganda.

Max Beran

3 months ago

Reply to John Wilkes

and from glib off-the-cuff remarks like those of Right-Wing Hippie above.

John Tyler

3 months ago

As much research has shown, not only do juries and legal eagles have little understanding of statistics, but they are also subject to various biases that warp their perception of even the finest statistical calculations.

Carlos Danger

3 months ago

Reply to John Tyler

Clever lawyers can exploit biases to turn trials from an analytical search for truth to an emotion-driven farce. We are seeing that with the trials of Trump: the civil trials that resulted in a $450 million fraud award when no one was injured, and an almost $90 million defamation/rape award on patently weak evidence. Not to mention 4 felony indictments with 91 charges that carry jail terms lasting several lifetimes for “crimes” where there is not a single identifiable victim.

Geoffrey Kolbe

3 months ago

Thinking like a Bayesian is actually remarkably difficult. I was talking about Bayesian statistics to a Professor of Mathematics at Glasgow University, and he admitted that after teaching Bayesian statistics for 20 years he was only now getting the hang of it…

UnHerd Reader

3 months ago

Excellent article. Thank you very much

Richard Roland

3 months ago

A paediatrician, Roy Meadows, called as an expert witness for the prosecution, told the court that the probability of the two deaths happening by chance was one in 73 million: that is, 8,500 times 8,500.

Why would a judge allow a paediatrician as an expert witness on calculating probabilities? Perhaps he was not called as an expert in that, but in paediatrics. But upon his straying into an area far removed from his competency, the defense should have objected and the judge sustained the objection.

William Edward Henry Appleby

3 months ago

Bayes theorem isn’t all it’s cracked up to be – where do you get the prior distribution from? It’s partly faith-based statistics.

UnHerd Reader

2 months ago

I will be buying Tom Chivers’s book because the covid nonsense woke me up to the power of statistics and how they can be manipulated. Prior to the pandemic I always thought statistics were very boring. This is off topic but I have been shocked at publications about the number of deaths prevented in Scotland by the covid vaccines. Towards the end of 2021 the WHO published a report, which was picked up by mainstream media and many politicians, that estimated that the vaccines saved over 27,000 lives in Scotland. I had been watching the data on deaths in Scotland and the claim seemed way too high. I tried to engage with anyone involved in the report but got nowhere.
Just the other day I came upon an article on PHS website claiming that the vaccines saved over 22,000 lives in Scotland. They referred to the earlier article and stated that the figure has been updated. Now I am pleased that they have reduced the outlandish claim of 27,000 but 22,000 still seems unreasonably high. I am beginning to wonder if it is me, with my very small knowledge of statistics that is wrong. If anyone can explain it to me I’d be very grateful.
To clarify, my thinking is not based on a hunch or belief that the vaccines didn’t work (although I have my doubts about their effectiveness) but on the mortality numbers published on the PHS website. All cause mortality in 2019 was 58,108, in 2020 it was 64,054, in 2021 it was 63,584 and in 2022 it was 62,942. How likely, therefore, is it that the vaccines saved over 22,000 lives?

Roger Tilbury

2 months ago

You’d think you’d get Roy Meadow’s name right…

Marcus Corbett

2 months ago

Tom Chivers was utterly discredited during Covid times and should remain discarded by unherd. Sophisticated sounding hogwash that at the time was intellectually indefensible.

Mark Kennedy

2 months ago

Reply to Marcus Corbett

Alas, we now know that “utterly discredited during Covid times” and “utterly wrong” or “intellectually indefensible” aren’t coincident sets.

Mark Kennedy

2 months ago

Another statistical reality curiously overlooked is that while a probability of just one in a million does indeed make an occurrence relatively rare, it also guarantees that, on average, the thing will happen once in every million tries.
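A quick sketch of what “on average” means here, assuming a million independent tries that each have a one-in-a-million chance. The expected number of occurrences is exactly one, but that is an average rather than a certainty for any particular million tries.

```python
import math

p = 1e-6       # chance of the event on a single try
n = 1_000_000  # number of independent tries (assumed independent)

# Expected number of occurrences over n tries.
expected_count = n * p

# Chance the event happens at least once in n tries: 1 - (1 - p)^n,
# computed with log1p/expm1 to avoid rounding trouble with tiny p.
p_at_least_once = -math.expm1(n * math.log1p(-p))

print(f"expected occurrences: {expected_count:.2f}")
print(f"P(at least once):     {p_at_least_once:.3f}")
print(f"P(never):             {1 - p_at_least_once:.3f}")
```

So the event turns up about once per million tries in the long run, yet in any given run of a million there is still roughly a 37% chance (close to 1/e) that it does not happen at all.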
