What Donald Trump Gets Wrong About the Polls

October 24, 2016

Kyle Jaeger

In an effort to explain away unfavorable polls, Republican presidential nominee Donald Trump embraced an increasingly popular conspiracy theory on Monday. He argued that polling firms, allegedly colluding with his opponent's campaign, were intentionally surveying more Democrats than Republicans. In short, Trump says the polls are rigged.


This, he said, amounted to a "voter suppression technique." The problem is that oversampling — the process of surveying additional members of certain demographics — is "a completely valid statistical practice that everyone uses, including Republican pollsters and probably Trump’s own campaign," The Atlantic's Andrew McGill wrote.

Let's look at one of the latest national polls on the presidential race, for example. This CNN/ORC poll has Democratic presidential nominee Hillary Clinton ahead by six points.

If you look at the survey methodology, the pollsters reported that 37 percent of the likely voters included in the survey identified themselves as Democrats, while only 30 percent identified as Republicans. (The remaining respondents "described themselves as independents or members of another party.")


Does that seven-point gap in party affiliation effectively nullify Clinton's six-point lead? Not quite.

Oversampling is a common practice with a legitimate function. It allows pollsters to gather and analyze data on specific demographics that would otherwise have too few respondents in the sample to analyze reliably. Even when the overall sample is large enough to be statistically sound, pollsters might have needed to compensate for a lack of respondents in certain regions of the country, income levels, age groups, or ethnicities. In doing so, they might reasonably end up with more respondents who identify as Democrats.

The Pew Research Center broke down the logic behind oversampling:

"For some surveys, it is important to ensure that there are enough members of a certain subgroup in the population so that more reliable estimates can be reported for that group. To do this, we oversample members of the subgroup by selecting more people from this group than would typically be done if everyone in the sample had an equal chance of being selected. Because the margin of sampling error is related to the size of the sample, increasing the sample size for a particular subgroup through the use of oversampling allows for estimates to be made with a smaller margin of error. A survey that includes an oversample weights the results so that members in the oversampled group are weighted to their actual proportion in the population; this allows for the overall survey results to represent both the national population and the oversampled subgroup."
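To put rough numbers on Pew's point about sample size and margin of error, here is a minimal sketch. It assumes a simple random sample, a 95 percent confidence level, and the worst-case split of 50/50; the function name is ours, and real pollsters use more involved calculations that account for weighting.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of sampling error for a proportion.

    Assumes a simple random sample of size n; p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare a small subgroup with larger samples (sizes are illustrative).
for n in (200, 500, 1000):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} points")
```

Under these assumptions, a subgroup of 200 respondents carries a margin of error of roughly seven points, while 500 respondents brings it down to a little over four. That is the gain an oversample buys.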

Think about it this way. If a pollster surveying, say, 1,000 likely voters wanted to know how people living in urban areas felt about the economy, but only 200 of the respondents lived in urban areas, then they'd need to survey more urban residents to draw a statistically reliable conclusion about that demographic's views. People living in urban areas tend to identify as Democrats, so that oversampling could appear biased. But the extra respondents only count in full for the questions about that subgroup; pollsters weight the oversample back down when it comes to questions relevant to the general population, such as presidential preference.
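The adjustment can be sketched with made-up numbers. Everything below is hypothetical: we assume urban residents are 20 percent of the electorate, that the pollster adds 300 extra urban interviews (500 urban, 800 non-urban in total), and that 60 percent of urban and 45 percent of non-urban respondents back a given candidate. The function name is ours, not any pollster's actual method.

```python
def weighted_share(groups, population_share):
    """Weight each group back to its share of the population.

    groups: {name: {"n": respondents, "yes": respondents backing the candidate}}
    population_share: {name: assumed share of the electorate}
    """
    total_n = sum(g["n"] for g in groups.values())
    weighted_yes = 0.0
    weighted_n = 0.0
    for name, g in groups.items():
        # Weight = (respondents the group should have) / (respondents it has).
        weight = population_share[name] * total_n / g["n"]
        weighted_yes += weight * g["yes"]
        weighted_n += weight * g["n"]
    return weighted_yes / weighted_n

# Hypothetical sample that includes an urban oversample.
groups = {
    "urban": {"n": 500, "yes": 300},      # 60% support
    "non_urban": {"n": 800, "yes": 360},  # 45% support
}
population_share = {"urban": 0.2, "non_urban": 0.8}

unweighted = sum(g["yes"] for g in groups.values()) / sum(g["n"] for g in groups.values())
weighted = weighted_share(groups, population_share)

print(f"unweighted: {unweighted:.1%}")
print(f"weighted:   {weighted:.1%}")
```

In this made-up case the raw tally overstates the candidate's support at about 51 percent, because urban respondents are overrepresented. Once each group is weighted back to its assumed share of the electorate, the topline comes out at 48 percent, the same figure a sample without the oversample would be expected to produce.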

So what we have here is oversampling but no evidence of bias. If Democrats really wanted to rig polls in Clinton's favor, they'd save some money by manipulating the survey at the outset, Republican pollster Jon McHenry told The Atlantic.

"If you wanted to just bias the poll, you wouldn’t waste the extra money making all these extra calls — you’d try to manipulate it from the beginning," McHenry said. "This is an added expense to the pollster with the idea of getting more information about a certain subgroup, and weighting that back so you understand the overall as well. Sometimes the polls say what they say because they’re accurate."

[h/t The Atlantic]