The limits of daily polling numbers

Feb. 8, 2008, 8:30 PM UTC / Source: National Journal

By Mark Blumenthal

Since its public debut eleven days ago, the Gallup Daily national survey has experienced a surge of interest among those of us who obsess over campaign politics. Everyone, from Clinton pollster Mark Penn (who cited the Gallup tracking survey in a post-Super Tuesday spin memo) to our readers on Pollster.com, seems to be watching each daily release and reacting to and questioning every twitch in the survey's trend lines.

Gallup provides political junkies with their daily fix using the same methodology as the standard Gallup Poll, including live interviewers, up to five "call backs" to those initially unavailable, and even separate calls to cell phones to reach those without landline phone service. So now those insiders who secretly peeked at the daily Rasmussen Reports tracking numbers, despite condemning its interviewer-free automated methodology, can satisfy their craving for data from a trusted source they can openly discuss.

Perhaps this is a good time to provide my fellow polling addicts with a user's guide and a reality check.

First, some background on how it works: Every night of the week, Monday through Sunday, Gallup interviews 1,000 adults in randomly selected households nationwide. Each night, pollsters identify approximately 400 self-identified Democrats or Democratic-leaning independents and 350 Republicans or Republican-leaning independents who say they are extremely, very or somewhat likely to vote (or who have already voted) in their state's primary or caucuses. For likely voters, Gallup asks about their presidential vote preference. For those who have already voted, pollsters ask which candidate they supported.

Since the nightly samples of primary voters are small, Gallup reports a three-day rolling average. The results reported on Thursday are based on interviews conducted Monday through Wednesday, the results on Friday are based on interviews from Tuesday through Thursday and so on.
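For readers who want to see the arithmetic, here is a minimal sketch of a three-day rolling average in Python. The nightly figures are invented for illustration, and the function name `rolling_average` is hypothetical. The sketch also assumes each night contributes equally; Gallup may instead pool the individual interviews, which would give slightly different results when nightly sample sizes differ.

```python
def rolling_average(nightly, window=3):
    """Average each run of `window` consecutive nightly results."""
    return [
        sum(nightly[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(nightly))
    ]

# Hypothetical nightly support (percent) for one candidate, Monday through Friday.
nightly = [44.0, 49.0, 45.0, 48.0, 42.0]

# First value covers Monday-Wednesday (reported Thursday), the next
# covers Tuesday-Thursday (reported Friday), and so on.
averages = rolling_average(nightly)
print([round(a, 1) for a in averages])
```

Note that each reported number shares two of its three nights with the number before it, so consecutive releases are not independent readings.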

Those who advocate this approach to polling claim two principal benefits. First, in theory at least, is the potential to monitor reactions to events on a daily basis. Second, and arguably more important, is the ability to amass a huge pool of data that can be sliced and diced retroactively to assess the impact of events. Did that unexpected gaffe make a difference? Compare 2,000 interviews conducted before the gaffe to 2,000 interviews conducted after, and you will know not only whether it mattered, but to whom and how much.

For all this promise, however, daily rolling-average tracking creates a few potential problems in the way consumers interpret the data.

First, rolling together interviews conducted over several days may reduce the random variation (technically "random sampling error") that comes with interviewing a sample rather than the entire population, but it does not eliminate it. The three-day average results reported each day have a "margin of error" of +/- 3 percentage points, which applies separately to the support measured for each candidate. Because that error applies to both candidates, the uncertainty in the gap between them is roughly twice as large. That means, for example, that if the presidential race remains unchanged for several weeks with one candidate leading by, say, five percentage points, the margin as measured could fluctuate over the course of a week from a 10-point margin for the leader to a dead heat by chance alone.
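A rough check on those figures, using the textbook formula for a simple random sample at the 95 percent confidence level. This is a sketch under stated assumptions, not Gallup's own calculation: it ignores weighting and other design effects, which typically make the real margin somewhat larger. The 1,200-interview sample size (about 400 Democratic primary interviews on each of three nights) and the 47.5-to-42.5 split are assumptions chosen to match the five-point example, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # multiplier for a 95 percent confidence level

def margin_of_error(share, n):
    """Margin of error, in percentage points, for one candidate's share."""
    return 100.0 * Z_95 * math.sqrt(share * (1.0 - share) / n)

def margin_of_error_on_lead(leader, trailer, n):
    """Margin of error, in points, for the gap between two candidates
    measured in the same sample (their shares are negatively correlated)."""
    variance = (leader + trailer - (leader - trailer) ** 2) / n
    return 100.0 * Z_95 * math.sqrt(variance)

three_day_n = 1200  # assumed: roughly 400 interviews per night over three nights

per_candidate = margin_of_error(0.5, three_day_n)
on_lead = margin_of_error_on_lead(0.475, 0.425, three_day_n)

print(f"per candidate: +/- {per_candidate:.1f} points")
print(f"on a five-point lead: +/- {on_lead:.1f} points")
```

Under these assumptions the per-candidate margin comes out a bit under 3 points, and the margin on the lead a bit over 5, which is why a true five-point lead can show up as anything from a dead heat to a double-digit gap.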

Second, the three-day rolling averages smooth out the line in a way that creates something of an optical illusion. If we charted the results for each night separately, we would see the purely random up-and-down for what it is. But when we smooth out the lines, we sometimes forget that some of the movement up or down is still random. We assume that every twitch in the line is real and easily find narratives to explain what just may be statistical noise.

Third, for all the warnings from pollsters about how we need to roll together three nights' worth of data to come to meaningful conclusions, consumers of the data inevitably try to guess at "last night's numbers."

But if we look at just one night's data, which for the Gallup Daily involves roughly 400 interviews for the presidential primary questions, the random error gets bigger (+/- 5 percentage points), and the built-in control of three-day smoothing goes away. So if, hypothetically, one candidate maintains a five-point lead (in reality) over the course of two weeks, we could easily see a move in the one-day numbers (as measured) from a 10-point lead to a dead-even race, or vice versa, at least once by chance alone.
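The hypothetical above can be sketched as a small simulation: hold the true race fixed at a five-point lead, draw 400 random interviews a night for 14 nights, and see how far the measured one-night lead wanders. Everything here is an assumption for illustration, including the 47.5-to-42.5 split, the fixed random seed and the function name `simulate_nightly_leads`. It models pure sampling noise only, with no weighting or nonresponse.

```python
import random

def simulate_nightly_leads(p_leader, p_trailer, n, nights, seed=0):
    """Return the measured lead (in points) for each simulated night,
    given true shares that never change."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    leads = []
    for _ in range(nights):
        leader_votes = trailer_votes = 0
        for _ in range(n):
            draw = rng.random()
            if draw < p_leader:
                leader_votes += 1
            elif draw < p_leader + p_trailer:
                trailer_votes += 1
            # otherwise: undecided or another candidate
        leads.append(100.0 * (leader_votes - trailer_votes) / n)
    return leads

# True lead is 5 points every night; only the sample changes.
leads = simulate_nightly_leads(0.475, 0.425, n=400, nights=14)
print("smallest measured lead:", min(leads))
print("largest measured lead:", max(leads))
```

Changing the seed changes the particular nights, but the spread between the smallest and largest measured lead is typically many times larger than any real movement in a race that, by construction, never moved at all.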

Like it or not, we need more than a single night's interviews to confirm whether small shifts in public opinion are real. So patience is still a virtue, even with all the daily data.
