As with several other states in recent months, Pennsylvania's primary race is producing some unusually divergent polling results. In just the last week we have seen surveys showing Clinton leading by 20, Obama up by 3 and everything in between. But for all of the variation in both the results and the methodologies used by the pollsters that reported them, one statistic has been relatively constant. As noted two weeks ago by my colleague Amy Walter, Obama's percentage of the vote has been less variable, typically falling somewhere between 40 percent and 45 percent.
The best way to consider a big mass of data, as my first statistics professor used to say, is to draw a picture. So consider the chart below, which plots the results of each Pennsylvania poll fielded since the March 18 speech by Obama on race and the Jeremiah Wright controversy. The darker blue points represent polls fielded all or in part over the last week, since news broke about Obama's controversial remarks at a California fundraising event.
The wide spread in the cloud of points illustrates the key issue, also noted this morning by NBC's Chuck Todd: Obama's range is more consistent (between 37 percent and 45 percent), "while Clinton's number is all over the map" (between 40 percent and 57 percent).
Dots plotted near the lower left corner of the chart have a bigger undecided number, while those closer to the upper right have a smaller number of undecideds. So as the undecided percentage gets lower, Clinton's support gets higher.
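The arithmetic behind that pattern can be sketched in a few lines of Python. Each poll's shares sum to roughly 100, so when Obama's number holds steady, any points that leave the undecided column have to show up on Clinton's side. The poll values below are made up for illustration (they are not actual survey results), the helper name `undecided_share` is hypothetical, and the sketch assumes for simplicity that everyone not backing a candidate counts as undecided.

```python
def undecided_share(clinton, obama):
    """Percent left over after both candidates' shares are subtracted.

    Assumption: "other" responses are lumped in with undecided.
    """
    return 100 - clinton - obama


# Hypothetical polls for illustration only, NOT real survey results.
# Obama's share is held nearly constant, as the article describes.
polls = [
    {"clinton": 42, "obama": 42},
    {"clinton": 47, "obama": 41},
    {"clinton": 50, "obama": 42},
    {"clinton": 54, "obama": 41},
]

for poll in polls:
    poll["undecided"] = undecided_share(poll["clinton"], poll["obama"])

# Order the polls from most to least undecided and read off Clinton's share.
by_undecided = sorted(polls, key=lambda p: p["undecided"], reverse=True)
clinton_trend = [p["clinton"] for p in by_undecided]

for p in by_undecided:
    print(f"undecided {p['undecided']:>2}%  Clinton {p['clinton']}%  Obama {p['obama']}%")
```

With Obama pinned near 41 or 42 percent in these invented numbers, Clinton's share climbs step by step as the undecided share shrinks, which is the same shape the chart shows.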
It is worth noting that there are many differences in the methods used by the pollsters active in Pennsylvania. Some use live interviewers; others use an automated "interactive voice response" (IVR) method. Some interview for as long as four to five days with repeated attempts to contact unavailable respondents, while others complete as many interviews as they can in one night with no "callbacks." Some sample randomly generated telephone numbers; others sample from lists of registered voters. And of course, the "likely voter" screens vary. As a result of all of these factors, the demographic and geographic compositions of the various poll samples may differ in ways that are not obvious from the horse race results.
But if we can set those concerns aside for a moment, we ought to consider why the "undecided" result varies as much as it does among pollsters, and why Clinton's support in Pennsylvania appears to rise as the undecided percentage falls.
The large variation in undecideds is not unusual. Ultimately, the size of the undecided category can depend on how hard the pollster "pushes" uncertain voters for a decision: Does the question offer respondents "undecided" as an option? Does it include a follow-up probe asking uncertain respondents how they lean? Are interviewers trained to push for a decision -- repeating the candidate choices as necessary -- or do they immediately take "I'm not sure" as an answer? Does the automated question pause a few seconds before offering "undecided" as a choice? All of these mechanics can help push respondents harder for an answer.
And what does it mean that undecided respondents seem to gravitate to one candidate when pushed?
The most likely explanation is that uncertain voters consider Clinton a safer choice and tend to opt for her rather than Obama when pushed. Obama has long been perceived by Democrats as the candidate best able to bring change to Washington, but Democrats also agree that Clinton has more experience. The combination of the Wright and "bitter" controversies surrounding Obama may be giving some voters pause, and the strategy of the Clinton campaign appears directed at maximizing that sense of uncertainty. This pattern creates the possibility that the bulk of the remaining undecided voters may "break" to Clinton between now and primary day.
One complication here is that the most extreme results on the chart above come from just four pollsters: SurveyUSA, which typically reports a very small undecided number and a bigger-than-average Clinton lead; Public Policy Polling, which typically reports a double-digit undecided number and better-than-average results for Obama; and the American Research Group and InsiderAdvantage, which manage to contribute results at both extremes. All except American Research Group use an automated methodology.
Remove those four pollsters from the chart and the "all over the place" spread in the Clinton percentage largely disappears. On the seven remaining polls in which 6 percent to 8 percent are undecided, Clinton leads Obama by an average of 50 percent to 41 percent; on the 10 remaining polls with 9 percent to 18 percent undecided, Clinton leads by an average of 47 percent to 41 percent. So the wide variation in Clinton's percentage is probably about more than just how hard the pollsters push respondents for a choice.
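For readers who want to repeat this kind of comparison, here is a minimal Python sketch of the grouping step: sort polls into buckets by undecided percentage, then average each candidate's share within a bucket. The function name `average_by_bucket` is hypothetical, and the four polls are invented values chosen only so that the bucket averages land on the figures quoted above; they are not the actual surveys. The bucket boundaries (6 to 8 and 9 to 18) come from the text.

```python
from statistics import mean


def average_by_bucket(polls, buckets):
    """Average Clinton and Obama shares for the polls whose undecided
    percentage falls inside each (low, high) bucket, bounds inclusive."""
    result = {}
    for low, high in buckets:
        group = [p for p in polls if low <= p["undecided"] <= high]
        if group:  # skip buckets with no polls
            result[(low, high)] = {
                "clinton": mean(p["clinton"] for p in group),
                "obama": mean(p["obama"] for p in group),
                "n": len(group),
            }
    return result


# Hypothetical polls for illustration only, NOT real survey results.
polls = [
    {"clinton": 51, "obama": 42, "undecided": 7},
    {"clinton": 49, "obama": 40, "undecided": 8},
    {"clinton": 48, "obama": 42, "undecided": 10},
    {"clinton": 46, "obama": 40, "undecided": 14},
]

summary = average_by_bucket(polls, [(6, 8), (9, 18)])
for (low, high), avg in summary.items():
    margin = avg["clinton"] - avg["obama"]
    print(f"undecided {low}-{high}%: Clinton {avg['clinton']}, "
          f"Obama {avg['obama']}, margin {margin} (n={avg['n']})")
```

In the article's real averages, the gap between the two buckets is only 3 points on Clinton's side (50 versus 47) and none on Obama's, far smaller than the 40-to-57 range seen when all pollsters are included.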
Some may also be tempted to speculate about the so-called Bradley-Wilder effect, in which polls in the 1980s and early 1990s tended to underestimate support for white candidates with black opponents. The theory was that the fear of "social discomfort" made some respondents withhold their true preference if they thought it would create tension in the interview. But the contradictory evidence here is that polls conducted with an IVR methodology -- which has no live interviewer -- are contributing responses at both ends of the chart above.