Are the polls underestimating Trump again?


September 24, 2024 - 10:20am

In both the 2016 and 2020 presidential elections, pollsters underestimated Donald Trump’s level of support. According to 538’s election day polling averages, the Democratic candidate’s actual national lead over Trump was 3.9 points smaller than the polls predicted in 2020 and 1.8 points smaller in 2016. This pattern looks set to continue with Kamala Harris in 2024, and part of the issue stems from how pollsters estimate turnout.

Social desirability bias is a well-studied phenomenon in survey research. In simple terms, it is the tendency of people to give survey responses that others may view more favourably, for example when asked whether they voted in an election or whether they give money to charity.

Because voting is seen as socially desirable, voters are, generally speaking, not particularly good at assessing their own likelihood of voting. Surveys that rely simply on self-reported turnout may therefore carry an added degree of bias in their results.

To combat this problem, Focaldata devised a turnout model to estimate the likelihood that a respondent in our US election surveys will actually vote, rather than simply relying on their own estimation. To create it, we used the 2020 Cooperative Election Study (CCES) panel of 60,000 respondents, who were matched to the voter file to determine whether each respondent actually voted in the election.
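As a rough illustration of what matching a panel to the voter file makes possible, the sketch below computes a validated turnout rate for each self-reported category. It is not Focaldata's model or the CCES data format: the function name `validated_turnout_rates`, the record layout and the eight example records are all hypothetical.

```python
from collections import defaultdict


def validated_turnout_rates(records):
    """Return the share of respondents in each self-report category who voted.

    `records` is an iterable of (self_report, voted) pairs, where `voted`
    is True if the voter-file match shows the respondent cast a ballot.
    The layout is an assumption made for this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [voted, total]
    for self_report, voted in records:
        counts[self_report][0] += int(voted)
        counts[self_report][1] += 1
    return {category: voted / total for category, (voted, total) in counts.items()}


# Hypothetical matched records, invented purely for illustration.
records = [
    ("certain", True), ("certain", True), ("certain", False), ("certain", True),
    ("probable", False), ("probable", True), ("probable", False), ("probable", False),
]

rates = validated_turnout_rates(records)
print(rates)
```

With a real matched panel, the same calculation would be run on tens of thousands of records, and could be split further by age or education.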

The CCES allows us to determine whether a person’s self-declared likelihood to vote reflects their subsequent turnout. On the surface, a reasonable estimate for a pollster might be to assign “certain” voters a 100% likelihood of voting, “probable” voters somewhere around 75%, undecided voters a 50-50 chance, and “would not vote” 0%. In reality, these figures do not correspond with actual voting behaviour.

[Chart: Respondents overestimate their likelihood of voting in US elections. Self-reported likelihood vs validated turnout in 2020.]

In 2020, over a quarter — 27% — of people who said they were certain to vote, going to vote early, or had already voted, did not actually vote in the presidential election. Even more strikingly, those who said they would “probably” vote only turned out 23% of the time. In addition, a respondent saying they will not vote does not entirely preclude them from voting — 5% of those who said they wouldn’t vote actually did.
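To see how far the naive assumptions sit from the validated figures, the sketch below applies both sets of probabilities to the same sample. The validated rates are the ones quoted above (73%, 23% and 5%). The sample composition of 700, 200 and 100 respondents is hypothetical, and the undecided category is left out because no validated rate is quoted for it.

```python
# Naive probabilities a pollster might assume for each self-reported category.
NAIVE = {"certain": 1.00, "probable": 0.75, "would_not_vote": 0.00}

# Validated 2020 turnout rates quoted in the article.
VALIDATED = {"certain": 0.73, "probable": 0.23, "would_not_vote": 0.05}


def expected_voters(counts, rates):
    """Expected number of actual voters given per-category turnout rates."""
    return sum(counts[category] * rates[category] for category in counts)


# HYPOTHETICAL sample composition, chosen only to make the arithmetic easy.
sample = {"certain": 700, "probable": 200, "would_not_vote": 100}

naive_total = expected_voters(sample, NAIVE)
validated_total = expected_voters(sample, VALIDATED)

print(f"Naive expectation:     {naive_total:.0f} of 1000 respondents vote")
print(f"Validated expectation: {validated_total:.0f} of 1000 respondents vote")
```

In this made-up sample the naive mapping expects 850 voters, while the validated rates imply 562, so the naive approach counts far more respondents as voters than actually turn out.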

A respondent’s self-declared likelihood is important, but it should not be the sole factor in a turnout prediction in an opinion poll. Some pollsters do not even assess likelihood of voting, instead relying solely on registered voters to generate their headline results. Implicitly, a registered voter poll assumes every voter has the same probability of voting — provided they are registered — which we know empirically is not the case.

If rates of overstating turnout were similar across different demographic groups, the turnout weighting problem for pollsters would be quite small and its effects would mostly cancel out. However, reported and actual behaviour differ markedly by age group and education level, which makes the problem much larger.

Consider voters under 35. In 2020, young voters who said they were “definitely” going to vote only voted around half the time. Among those aged over 65, the figure shoots up to 85%. Similarly, those with high levels of education are much more likely to correctly assess their probability of voting. 80% of “definite” voters with postgraduate degrees turned out, and just 1% who said they wouldn’t vote ended up voting. In contrast, only 63% of self-declared “definites” who didn’t graduate high school voted, and 5% who said they wouldn’t vote did.

If we were to simply assume “definitely” means the same thing across different groups, we would end up with poll results too heavily skewed towards the views of younger, non-white and lower-education voters. Two of these three groups lean heavily towards the Democrats, partially explaining why the party’s candidate has been overestimated in the polls at the last two presidential elections.
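The skew can be illustrated with a minimal sketch that weights each respondent by their group's turnout probability. The turnout rates for under-35s (about 50%) and over-65s (85%) are the figures quoted for self-declared “definite” voters. The vote shares and group sizes are hypothetical and chosen only to show the direction of the effect. This is not Focaldata's model, which also uses race, education and political interest.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    respondents: int
    turnout_prob: float  # validated turnout among self-declared "definite" voters
    dem_share: float     # HYPOTHETICAL share backing the Democratic candidate
    rep_share: float     # HYPOTHETICAL share backing the Republican candidate


def dem_margin(groups, use_turnout):
    """Democratic lead in percentage points, optionally turnout-weighted."""
    dem = rep = total = 0.0
    for g in groups:
        # Without turnout weighting every respondent counts equally.
        weight = g.respondents * (g.turnout_prob if use_turnout else 1.0)
        dem += weight * g.dem_share
        rep += weight * g.rep_share
        total += weight
    return 100 * (dem - rep) / total


groups = [
    Group("under_35", 500, 0.50, 0.60, 0.40),
    Group("over_65", 500, 0.85, 0.45, 0.55),
]

unweighted = dem_margin(groups, use_turnout=False)
weighted = dem_margin(groups, use_turnout=True)

print(f"Lead if every 'definite' is taken at face value: {unweighted:.1f} points")
print(f"Lead after weighting by validated turnout:       {weighted:.1f} points")
```

In this made-up example the Democratic lead falls from 5.0 points to about 1.1 once the lower-turnout group is weighted down. The size of the shift depends entirely on the assumed vote shares.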

Using a sophisticated turnout model which takes into account the effects of self-reported likelihood to vote — alongside other demographic factors such as age, race, education and political interest — reduces Harris’s lead over Trump by an average of 2.4 percentage points in our latest wave of swing state polls. In an election which could be decided by just 60,000 voters in November, this margin could easily be the difference between a right and wrong call on the election winner. Pollsters who simply rely on self-reporting may be subject to another polling miss in Trump’s favour.

This is an edited version of an article originally published by Focaldata.

Patrick Flynn is a data journalist at Focaldata


