How to read a Political Poll!

The sheer number of polls this political cycle is amazing.  The sheer number of bad polls this political cycle is stunning.

Regardless, the manner in which the press reports on polling is just God-awful.

Biggest Polling Complaint

My largest complaint is that the amount of uncertainty in a poll is either not reported, under-reported, or misunderstood.

A poll is only a sample (hopefully a random, well-constructed one) of a population.  When any pollster moves from describing the sample to inferring something about the population, a known amount of uncertainty is introduced into the results.

Neither the press nor most poll consumers are explicitly considering or communicating the amount of uncertainty in ANY poll.

So here is a quick primer on “How to read a damn political poll”:

Don’t forget the inherent uncertainty in polling

Margin of error – This is the stated uncertainty involved when moving from a description of the sample to an inference about the population.  Often you will see the margin of error expressed as ±x%.

Many often forget that this margin of error supplies a range of plausible answers, not a single number.

For example, let’s say candidate Ben Smith is polling with high negatives.

52% of the voters in our sample think Ben Smith is a full-on jerk.  The margin of error in this survey is the standard ±5%.

This margin of error is a function primarily of the size of the sample – the larger the sample, the smaller the margin of error.
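The relationship between sample size and margin of error can be sketched with the standard formula for a proportion (a minimal illustration; the function name and sample sizes here are my own, and it assumes a simple random sample with the conservative p = 0.5):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    p = 0.5 is the conservative worst case; z = 1.96 corresponds to the
    usual 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Larger samples shrink the margin of error, but with diminishing returns.
for n in (100, 385, 1000, 2500):
    print(f"n = {n:5d}  ->  MoE = +/-{margin_of_error(n):.1%}")
```

At n = 385 this lands at roughly the ±5% used in the Ben Smith example; note that quadrupling the sample size only halves the margin of error, which is part of why precise polling is expensive.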

NOTE:  Margin of error does NOT take into consideration or capture errors associated with question order, question format, coverage problems, or other factors that could systematically bias a poll.

This means when we move from describing a sample to the population as a whole, ANYWHERE between 47% and 57% of the population may think Ben Smith is a jerk.  Statistics tell us the correct answer is likely within this entire range.

[Figure: visualizing uncertainty in a poll]

We say likely because of the seldom-reported confidence level.

Confidence level – most of the time, poll statistics are analyzed at the 95% confidence level.

This means, roughly: if we theoretically repeated this exact poll 100 times, then 95 of the 100 times we would expect the real number to fall within the range indicated by the margin of error.

HOWEVER, you must also notice that 5 out of 100 times the correct answer could be OUTSIDE the stated range.

Combine margin of error and the confidence level – even before human factors enter the process – and you see there is uncertainty built into the very framework of polling.
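A quick simulation makes the "95 out of 100" idea concrete.  This is a sketch, not any pollster's actual method: the true value, sample size, and number of repeated polls below are invented for illustration.

```python
import math
import random

random.seed(0)

TRUE_P = 0.52   # hypothetical "real" share who think Ben Smith is a jerk
N = 400         # sample size per poll
Z = 1.96        # 95% confidence level

def run_poll():
    """Simulate one poll: sample N voters, return the estimate and its MoE."""
    hits = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = hits / N
    moe = Z * math.sqrt(p_hat * (1 - p_hat) / N)
    return p_hat, moe

# Repeat the poll many times and count how often the interval
# p_hat +/- MoE actually contains the true value.
trials = 1000
covered = sum(1 for p_hat, moe in (run_poll() for _ in range(trials))
              if p_hat - moe <= TRUE_P <= p_hat + moe)
print(f"{covered}/{trials} intervals contained the true value")
```

Run it and roughly 95% of the simulated polls bracket the true value; the remaining handful miss entirely, through no fault of the pollster.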

It’s math.

Don’t forget bad polls

Now, let’s consider the human factors.

On top of the uncertainty inherent in even a ‘perfect’ poll, some political and media players are taking shortcuts that compound these errors.

One of the basic prerequisites of good polling is that everyone within your sampling frame has an equal, random chance of being selected.

There are two current factors acting on the polling industry:

Response Rate Declines – the polling industry is seeing declining response rates, meaning you’re not answering your phone.

The research on the effect of declining response rates is mixed, with Pew finding in 2012 that, “despite declining response rates, telephone surveys that include landlines and cell phones and are weighted to match the demographic composition of the population continue to provide accurate data on most political, social and economic measures.”  (source: Pew Research)

The Rise of Cell Phones

In 2014, the Centers for Disease Control and Prevention estimated that 39.1% of adults and 47.1% of children lived in wireless-only households. This was 2.8 percentage points higher than the same period in 2012.

This change disproportionately affects young people (nearly two-thirds of 25- to 29-year-olds) and minorities (Hispanics are the most likely to be without a landline).  (source: Pew Research)

The rise of cell-phone-only homes is problematic for pollsters because the law forbids pollsters from using automation software to place calls to cell phones.  This prohibition is a direct cause of the increase in polling costs.

Combination of Response Rates and Cell Phones

We can safely assume these trends – declining response rates and increasing numbers of cell-phone-only voters – will continue.

In summary, voters with landlines aren’t answering, and reaching cell-phone-only voters is expensive.  It is this combination that makes good, quality research difficult and expensive.

The media and polling

I am loath to blame the media, but in this case there may be some justification – the media’s treatment and use of polling results is awful.

Explaining polling is nerdy.  The media may report a poll’s margin of error, but they never explain it.

I often ask people what they think ‘margin of error’ means.  A sample of replies:

  • 5% of the results are wrong
  • The results can be off as much as 5%
  • Anything within 5% is a statistical tie
  • We are confident the answer is within 5% of the result

NONE of these are correct.  It seems people often forget the ‘plus’ or ‘minus’ part of the stated error.

The media hardly ever state the confidence level.

The media, due to the economics of the news industry, want polling done cheaply.   They will use samples that exclude the costly cell-phone-only homes.

The media, due to the nature of news (‘if it bleeds, it leads’), want the horse race.  They need the excitement.  They simply can’t run a headline or a report that admits “the race may or may not be tied; we don’t really know due to uncertainty.”

The bottom line is that some media outlets aren’t concerned with accuracy or nuance.  Others are cheap and include only landline homes, excluding a significant part of the population.

In summary, the media present results with a level of certainty that doesn’t exist.   For example, a presidential candidate is shown to have a 2-point move from last week’s polling numbers.  The poll has a ±5% margin of error.  In this scenario, the number is bumping around within the range we would expect.  It is nothing more than expected sampling error.

However, some media would run a headline: “Candidate X Surges 2% in Newest Poll”
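There is an extra wrinkle the “surge” headline ignores: a week-to-week change involves two polls, each with its own error, so the margin of error on the change is even wider than either poll’s own ±5%.  A rough sketch (the function name is my own; this assumes two independent polls of equal quality):

```python
import math

def moe_of_change(moe_poll1, moe_poll2):
    """Approximate margin of error for the *difference* between two
    independent polls.  The errors of the two polls combine, so the MoE
    of a change is larger than either poll's own MoE."""
    return math.sqrt(moe_poll1 ** 2 + moe_poll2 ** 2)

# Two successive polls, each with a +/-5% margin of error:
print(f"MoE of the week-to-week change: +/-{moe_of_change(0.05, 0.05):.1%}")
```

The change is uncertain to roughly ±7%, so a 2-point “surge” sits well inside plain sampling noise.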

Lastly, don’t forget purposeful manipulation.

Let’s face it, there are a lot of political operatives and media outlets playing games with polls.

  • Only selected results are released
  • Questions are purposely written to shade responses
  • Awful samples (self-selected) are used and passed off as actual research
  • Results are presented with certainty that doesn’t exist

These political manipulators understand the powerful effect other peoples’ behavior has on voters.  Everyone loves a winner and the herd mentality takes over.

“Candidate X Surges 2% in Newest Poll” is most likely spin from a political operative.

How to read a poll

So as a consumer of a poll, there are some things you can do to increase your understanding of what a poll says and doesn’t say.

STEP 1 – Understand Poll Methodology

Before you read a poll’s results, read the methodology.

The methodology should tell you how the pollster conducted the poll.

The American Association for Public Opinion Research (AAPOR) declares that the following items should be disclosed:

  1. Who sponsored the survey and who conducted the survey,
  2. Exact wording of questions and response items,
  3. Definition of population under study,
  4. Dates of data collection,
  5. Description of sampling frames, including mention of any segment not covered by design,
  6. Name of sample supplier,
  7. Methods used to recruit the panel or participants, if sample from pre-recruited panel or pool,
  8. Description of the Sample Design,
  9. Methods or modes used to administer survey and languages used,
  10. Sample Sizes,
  11. A description of how weights were calculated,
  12. Procedures for managing membership, participation and attrition of panel, if panel used,
  13. Methods of interviewer training, supervision and monitoring if interviews used,
  14. Details about screening procedures,
  15. Any relevant stimuli, such as visual or sensory exhibits or show cards,
  16. Details of any strategies used to help gain cooperation (e.g., advance contact, compensation or incentives, refusal conversion contacts),
  17. Procedures undertaken to ensure data quality, if any (e.g., re-contacts to verify information),
  18. Summaries of the disposition of study-specific sample records so that response rates for probability samples and participation rates for non- probability samples can be computed,
  19. The unweighted sample size on which one or more reported subgroup estimates are based, and
  20. Specifications adequate for replication of indices or statistical modeling included in research reports.


Make no mistake, seldom does any political pollster release all this information (academic and government surveys often will), but there are some minimal, critical things to consider:

  • Sample Size – sample size drives margin of error.  The larger the sample, the smaller the MoE.
  • Definition of people being studied?  Registered versus Likely Voters?
  • Sample Frame – how is the pollster defining who has a chance to be polled?   Is past voting behavior a prerequisite?  Is it simply registered voters?  Landline only?  A combination?   Does the sample frame match the definition of the population as closely as possible?  Who is left out?
  • Mode(s) used to administer the survey? Telephone type, Internet, door to door?  A combination, and if so in what proportions?
  • Is the poll weighted? If so, weighted to what model/universe?
  • What is the Margin of Error?
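To see why the weighting question matters, here is a minimal, hypothetical sketch of demographic weighting (all numbers invented): each respondent group is scaled so the sample matches an assumed universe, and the headline estimate moves accordingly.

```python
# Hypothetical numbers: a sample that under-represents cell-only voters.
population_share = {"landline": 0.60, "cell_only": 0.40}   # assumed universe
sample_share     = {"landline": 0.80, "cell_only": 0.20}   # who actually answered
support_in_group = {"landline": 0.50, "cell_only": 0.60}   # candidate support

# Each group's weight scales it up or down toward the assumed universe.
weights = {g: population_share[g] / sample_share[g] for g in sample_share}

raw      = sum(sample_share[g] * support_in_group[g] for g in sample_share)
weighted = sum(sample_share[g] * weights[g] * support_in_group[g]
               for g in sample_share)

print(f"raw estimate:      {raw:.1%}")
print(f"weighted estimate: {weighted:.1%}")
```

Here the raw estimate is 52% but the weighted estimate is 54% – the result depends directly on the universe the pollster assumed, which is why you should ask what model the weights are built on.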

Step 2 – Understand the Polling Demographics

After finding this information, the next step is to look at the demographics of the poll.  Do things look correct and in proportion?

If you don’t know what the proportions should be for the population you are studying, results presented could be biased.

Due to the importance of partisanship in our political system, take a close look at the partisan breakdown.

If things look off in the demographics, be cautious and skeptical.

Are you looking at weighted numbers?  If so, what assumptions are built into the weights?

Step 3 – Look at the Polling Questions

Are the polling questions clear?  Are they loaded with explosive or leading language?

Are the polling questions not disclosed?

Step 4 – Look at the Polling Credibility Items

What is disclosed?

If NOTHING is disclosed, stop reading the poll or press story.   You’ll likely get the same information from reddit.

Does the poll pass the smell test?

Step 5 – Read All Polling Results with skepticism

Finally, if everything so far passes the smell test, then look at the polling results.  UNDERSTAND that any number presented as a finding has a range of plausible answers.   I find it helpful to restate a result to remind myself of the uncertainty inherent in polling.

If a candidate has 40% hard name ID with a ±5% MoE –

You can state this mentally: “Candidate X is known by 35%-45% of the population studied, AND 55%-65% of the population studied doesn’t recognize him/her.  PS – THERE IS STILL A CHANCE THIS IS WRONG.”
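That mental restatement can be sketched mechanically (a toy helper of my own, not a real polling tool):

```python
def restate(estimate, moe):
    """Turn a single poll number into the range it actually implies."""
    low, high = estimate - moe, estimate + moe
    return (f"known by {low:.0%}-{high:.0%} of the population studied, "
            f"unknown to {1 - high:.0%}-{1 - low:.0%}")

# 40% hard name ID with a +/-5% margin of error:
print(restate(0.40, 0.05))
```

The point is simply that every published number silently carries its full ± range on both sides of the question.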

Here is the bottom line on Reading Polls:

  • Remember, the quality of the polling data will be no better than the most error-prone features of the survey.
  • All polling, when making inferences about a population, contains inherent uncertainty due to fundamental math.
  • Polling well is difficult and expensive.
  • Always ask yourself, is a political operative attempting to manipulate you?

About Alex Patton