How to read polls

A discussion prompted the thought that there are lots of misconceptions about polling and what it is designed to do, which in turn lead to incorrect suggestions that polls are wrong and therefore useless. So, independent of specific polling discussions about this year's elections, I thought it would be useful to think more broadly about how polls work and how we should approach reading them. I don't claim to be a mathematical expert, so others more steeped in stats and probability will have more to add than I do, and will probably correct some of my thoughts. But given that polls remain the best source of evidence on how an election race is going, I have invested a bit of time in understanding how they work, why they are sometimes perceived as wrong, and why those perceptions are sometimes correct and sometimes not.

The key lessons for me in reading polls are...

Polls ≠ Punditry
Polls and punditry are different things. When people tell us how wrong the polls were in 2016, what they actually mean very often is that the consensus among political commentators was wrong. That is probably correct. The polls, when we look back on them, showed considerable uncertainty for a few reasons. First, they were quite volatile through the campaign. Second, neither candidate could break beyond about 45% average (at least not for very long) in swing states or nationally, so there were always large pools of undecided voters that created uncertainty. It is true that pundits probably didn't properly reflect that uncertainty in their commentary or in many of the models that they built. But a failure of pundits to properly analyse and reflect what polls are telling us is not a failure of the polls, it is a failure of punditry.

Polls are snapshots, not crystal balls
Polls measure sentiment now, not in the future. They cannot predict what events might occur to shift opinion. If I'm running in an election and I'm 47-40 up three weeks out, then 45-44 up one day out, and then I lose 44-45, that doesn't mean the poll three weeks ago was wrong. Three weeks ago, 47% might have planned to vote for me, and then over the course of three weeks some of them changed their minds. When we look at polls and try to predict what they mean for the election, we are entering the realm of punditry - we can use our knowledge of the campaign and candidates to guess whether the polls might shift, but the polls themselves cannot tell us that. So polls in early October 2016 couldn't predict that Comey would reopen the Clinton investigation late in the month, only to close it again a couple of days before the election. That fact, and the effect it had on the polls, doesn't make the polls in early October wrong.

Polls don't claim to be perfect
Polls have margins of error. Decent polls usually have MOEs of around 3%. Worse ones sometimes come out with MOEs of 5% or 6%. And those MOEs apply to each candidate's vote share, not to the lead. And even then, polls also have confidence intervals - usually 95% - which means the poll is saying that it is confident that the poll is accurate within the margin of error, 95% of the time. So a poll with a 3% MOE that shows Biden at 50% and Trump at 40% is actually saying that there is a 95% chance that Biden is somewhere between 47 and 53, and that Trump is somewhere between 37 and 43. So the 10% headline lead could actually be anything from a 16% lead to a 4% lead, with 95% confidence. And of course there is a 5% chance that the result might fall outside that.
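
To make that arithmetic concrete, here is a rough Python sketch of the Biden 50 / Trump 40 example. The function name is made up for illustration, and it simply stacks the MOE on each candidate's share the way the paragraph above does, so treat the lead range it gives as a worst case rather than a formal 95% range for the lead.

```python
def share_and_lead_ranges(share_a, share_b, moe):
    """Apply the MOE to each candidate's share separately, then derive the lead range."""
    range_a = (share_a - moe, share_a + moe)
    range_b = (share_b - moe, share_b + moe)
    # Smallest lead: A at the bottom of its range and B at the top.
    # Largest lead: A at the top of its range and B at the bottom.
    lead_range = (range_a[0] - range_b[1], range_a[1] - range_b[0])
    return range_a, range_b, lead_range


range_a, range_b, lead_range = share_and_lead_ranges(50, 40, 3)
print(range_a, range_b, lead_range)  # (47, 53) (37, 43) (4, 16)
```

The point is just that a 3% MOE on each share means the uncertainty on the lead is roughly double that, which is why a headline lead looks more solid than it is.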

Vote share matters more than leads
If it's election day, would you rather have a 10% lead at 40% to 30%, with 30% still making up their minds, or a 5% lead at 50% to 45%, with 5% still making up their minds? Clearly the second is the better position to be in. The big takeaway from 2016 is that we should pay attention to the undecideds. Polls don't predict how they will vote. Again, we can use some punditry here to take a guess. For example, we knew in 2016 that the undecideds were (on average) older, more conservative voters. We could take from that, using what we knew about the campaign, that they might have been the type of folk who would have supported Ted Cruz and who heard his Convention message to 'vote your conscience', or Republicans who flirted with Never Trumpism. So punditry might have told us there was a good chance that undecided voters would break heavily for Trump. But polls can't tell us that, and they don't claim to. So if you have a poll that shows Candidate A on 45% and Candidate B on 35% with 20% of voters undecided, the poll can't predict how that 20% will vote. If they break slightly better than three to one for Candidate B, so that A ends up on 49.9% and B on 50.1%, that doesn't make the poll wrong (even leaving aside the MOE). In most states in 2016, the undecided vote was considerably bigger than the margin between the candidates and, as it turned out, that vote broke for Trump. The lesson for 2020, I think, is that we should be looking at (a) how big the undecided vote is, and (b) what its demographic characteristics are. The size will tell us if it has the potential to change the outcome of the race, and the characteristics might help us to predict, or guess, how the undecided vote might break.
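
A quick way to see how much the undecideds matter is to split them by hand. This is only a sketch using the 45 / 35 / 20 example; the function name and the break fractions are made up for illustration, and real undecideds may also just stay at home.

```python
def final_shares(share_a, share_b, undecided, b_fraction):
    """Split the undecided share between A and B; b_fraction of it goes to B."""
    to_b = undecided * b_fraction
    to_a = undecided - to_b
    return share_a + to_a, share_b + to_b


# Undecideds breaking exactly three to one for B.
print(final_shares(45, 35, 20, 0.75))

# Undecideds breaking slightly more heavily for B (hypothetical 76%).
print(final_shares(45, 35, 20, 0.76))
```

An exact three to one break turns a 10 point lead into a dead heat, and anything beyond that puts B in front, all without the original poll having been wrong about the decided voters.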

Averages > Individual Polls
Individual polls have individual issues. First, not all polls are created equal. Some polls are conducted using more reliable methodology. For example, in 1948, the polls showing Dewey beating Truman were basically conducted by media types contacting friends and friends of friends. They then weighted according to some demographic characteristics, but the fundamental flaw remained: they were mainly Republicans, their friends were mainly Republicans and their friends of friends were mainly Republicans. If we were to put that poll beside a poll conducted using random-dial, live-interview polling, it would be nonsense to say they should have equal weight in our assessment. But even high-quality polls can get it wrong. They all have margins of error, and they all have that 1 in 20 chance of landing outside the margin of error. So by averaging polls and looking at the average rather than placing too much weight on any one poll, we get a better picture of the overall race, because the effect of outlier polls is smoothed.
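
As a rough illustration of why averaging helps, here is a small Python sketch. The poll leads and the quality weights are invented for the example, and the weighting is just a plain weighted mean, not the method any real poll aggregator uses.

```python
from statistics import mean


def weighted_average(values, weights):
    """Average of values where each one counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical leads (in points) for the same candidate from five polls.
# The last poll is an outlier from a pollster we trust less.
leads = [6.0, 7.0, 5.0, 8.0, 14.0]
# Made-up quality weights: 1.0 for solid methodology, 0.25 for the weak poll.
weights = [1.0, 1.0, 1.0, 1.0, 0.25]

simple = mean(leads)
weighted = weighted_average(leads, weights)
print(simple, round(weighted, 1))
```

With these made-up numbers the simple average is 8 points, well below the 14 point outlier, and giving the weaker poll less weight pulls the figure down further to about 6.9. How to choose the weights is a judgment call, which is where we drift back towards punditry.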

Look at the quantity and diversity of polling
Compare polls in Wisconsin in 2016 to 2020 and there's a massive difference. In 2016, Wisconsin was not perceived as competitive. So polling companies didn't invest money in polling it very much. So the polling average comes from about thirty polls conducted by about five or six pollsters over the course of the campaign. In 2020, there are, I would guess, over a hundred polls conducted by a much wider range of pollsters. That diversity and extent of polling suggests that the averages should be more reliable.


So with all of those limitations and caveats, are polls useless? No. They have considerable value and, when read with those caveats in mind, can tell us lots about the state of the race. That is why campaigns, and not just media/political geeks, use them extensively: they tell them where to invest resources, which messages or decisions are working well or badly, etc.

I know there is a category of Trump supporter totally uninterested in any of this: the evidence doesn't support their claim that Trump is winning, and his messages are landing well. So the evidence has to be dismissed as wrong. Those posters have no interest in actually dissecting polls. But for others of us, who actually do want to follow the election over the coming few weeks, and want to understand what the polls tell us (and what they don't tell us), it might be worth a discussion about how polls actually work.
 
Just to add that the US electoral system makes it even more complicated and a solid couple of % lead in National polls does not mean a win in the election. What is happening in the swing states is where the election will actually be decided.
 
It also depends on people telling the truth when asked who they would vote for, which is not always the case. You would expect exit polls to be pretty accurate, since people are asked how they voted after leaving the polling station. Last February's GE showed they can't be taken seriously, as there were a number of results that differed from the exit poll.
 
Cdebru said:
Just to add that the US electoral system makes it even more complicated and a solid couple of % lead in National polls does not mean a win in the election. What is happening in the swing states is where the election will actually be decided.
There are polls and polls. You can bet that party polls in competitive states are a lot more detailed than national polls taken for biased news outlets. Personally, I like to quote an old Canadian PM, John Diefenbaker - "Dogs know what poles are good for"
 
In the US case one would have to ask, first, are Republicans in particular either lying to pollsters or just refusing to answer them, and secondly, are people treating the polls as a way of sending a message to the President while still intending to vote for him in the privacy of the booth?
 
How is the margin of error determined? Always wondered.
 
The classic "polling error" occurred in 1936, when a popular magazine called Literary Digest conducted a survey of its readers by enclosing a questionnaire in the magazine. It asked their choice for President that year - sitting President Franklin D Roosevelt or Republican Governor Alfred Landon of Kansas?

The response was overwhelming - Alfred Landon was the choice for President of 57% of those polled. He was confidently expected to win. The poll was novel enough to be reported in the national newspapers. The sample size was 2.4 million!

Of course Landon didn't win - FDR won in a massive landslide, with 60% of the vote, and winning every state except 2. The Electoral College vote was 523 - 8.

It seems so obvious now, it is laughable - if you were wealthy enough in 1936 to subscribe to, or purchase, a literary magazine, then you would probably not be bothered much by the Great Depression. Literary Digest readers probably all belonged to a very narrow social class.

At the same time the Literary Digest was making its fateful mistake, George Gallup was able to predict a victory for Roosevelt using a much smaller sample of about 50,000 people.
1936 was a key election in polling history, as it initiated its use as a tool of electioneering, for good or bad. Nice little article about it here:

 
livingstone said:
I know there is a category of Trump supporter totally uninterested in any of this: the evidence doesn't support their claim that Trump is winning, and his messages are landing well. So the evidence has to be dismissed as wrong. Those posters have no interest in actually dissecting polls. But for others of us, who actually do want to follow the election over the coming few weeks, and want to understand what the polls tell us (and what they don't tell us), it might be worth a discussion about how polls actually work.
There is an interesting article in today's IT by Pete Lund regarding Covid in which he makes the point that it is very difficult for people to accept evidence that contradicts their belief https://www.irishtimes.com/life-and...-is-wrong-we-need-to-change-the-conversation-
 
Cdebru said:
Just to add that the US electoral system makes it even more complicated and a solid couple of % lead in National polls does not mean a win in the election. What is happening in the swing states is where the election will actually be decided.
This is true but it can be overstated.

If Biden were leading Trump by, say, 4% nationally then I would say that fact alone is pretty meaningless and we really need to rely on state polls. But if Biden is leading Trump by 10% nationally, then the national poll is relevant. I've not made any predictions for 2020 yet, but my first prediction is this: If Biden wins the national vote by 10%, he is not losing the electoral college.

There are a few posters on here who act like they've hit on the third secret of Fatima when they explain what the electoral college is, as if others don't know. But their pronouncements that national polls are useless are totally wrong. Like all polls, national polls have their limitations and their uses.
 
wombat said:
There is an interesting article in today's IT by Pete Lund regarding Covid in which he makes the point that it is very difficult for people to accept evidence that contradicts their belief https://www.irishtimes.com/life-and...-is-wrong-we-need-to-change-the-conversation-
Sure. And I recognise that even in myself. It's very easy to try and dismiss the poll that doesn't suit your preferred outcome.

We should all try to put ourselves into a more objective space of looking at what the totality of the evidence is telling us. And we can all struggle to do that.
 
Round tower said:
It also depends on people telling the truth when asked who they would vote for, which is not always the case. You would expect exit polls to be pretty accurate, since people are asked how they voted after leaving the polling station. Last February's GE showed they can't be taken seriously, as there were a number of results that differed from the exit poll.
Was there? As I recall, the exit poll overestimated FG by about 1.5% and underestimated SF by about 2%. Both very slightly outside the margin of error but only very slightly. FF and the other parties were pretty much spot on.
 
A long overdue and welcome thread, which I had toyed with starting too, but it is appropriate that livingstone should kick it off.
Well done sir

Think we should use this as a ground zero for polls

An appeal to all looking to have a good discussion: don't get pulled down rabbit holes by the inevitable "polls are meaningless" and "that poll is wrong" blanket statements
Also, for me at least, I'd like to keep as much punditry and even prediction as possible out of the discussion
The conflation of polls and punditry plays into the hands of the "polls are useless and meaningless" yahoos

Punditry is a bit of fun but has no real merit
No one can tell the future and we all have some degree of in built bias that will always chip in and erode hard figures

If anyone thinks a poll is wrong, please say where you think it is flawed so others can have a look and see if the alleged flaw has merit
 
livingstone said:
Polls have margins of error. Decent polls usually have MOEs of around 3%. Worse ones sometimes come out with MOEs of 5% or 6%. And those MOEs apply to each candidate's vote share, not to the lead. And even then, polls also have confidence intervals - usually 95% - which means the poll is saying that it is confident that the poll is accurate within the margin of error, 95% of the time. So a poll with a 3% MOE that shows Biden at 50% and Trump at 40% is actually saying that there is a 95% chance that Biden is somewhere between 47 and 53, and that Trump is somewhere between 37 and 43. So the 10% headline lead could actually be anything from a 16% lead to a 4% lead, with 95% confidence. And of course there is a 5% chance that the result might fall outside that.
Interesting and very complete OP. One thing I'm not entirely sure about here, though: as I understand it the confidence interval and the margin of error are the same thing, it's the confidence level that's slightly different, but still related.

The interval is the margin of error, expressed as a percentage, and reflects how sure you are that your sample size accurately reflects the whole population. The confidence level is how likely it is that the same poll, carried out over and over, would give the same results each time.

Difference between Confidence Level and Confidence Interval

Most of us would have used these terms and values in our statistical analysis and estimation. They sound similar and thus are also confusing when used in practice. While the purpose of these two are invariably the same, there is a minor and important difference between these two terms...
www.whatissixsigma.net
 
shutuplaura said:
How is the margin of error determined? Always wondered.
It is not an easy explainer.

Polls have a few sources of error -
  1. selection bias in the sample e.g. sample does not reflect the population
  2. sampling error, intrinsic to it being a sample
  3. response bias, error, or deception
There are mathematical ways to handle 2, but for the others pollsters probably have estimation methods based on their own experience of what works.

Understanding the margin of error in election polls

Some statistical rules of thumb that smart consumers might think apply in polls are more nuanced than they seem. As is often true in life, it’s complicated.
www.pewresearch.org

A rough rule of thumb is that the margin of error due to sampling error (2 above) is proportional to 1 divided by the square root of the sample size (1/sqrt[n]). So quadrupling the sample size halves the margin of error. But quadrupling the sample size also roughly quadruples the cost, so it is a tradeoff.

Also, the sampling error depends on the sample size, not on the percentage of the population sampled. After a certain point the improvement in margin of error from larger samples fades away and diminishing returns set in. You will find most political polls do not go far above 1,000 or 2,000, where the size is fairly optimal.
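
For anyone who wants to play with the numbers, here is a minimal Python sketch of that rule of thumb. It uses the standard textbook formula for a simple random sample at 95% confidence (z = 1.96) and assumes the worst case of a 50% share, so it covers only sampling error (2 above), not the adjustments real pollsters make for weighting or the other sources of error.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Sampling margin of error (as a proportion) for a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


# Each step quadruples the sample size, which halves the margin of error.
for n in (250, 1000, 4000):
    print(n, round(100 * margin_of_error(n), 1))
```

That prints roughly 6.2, 3.1 and 1.5 percentage points, which is why polls of around 1,000 people are usually quoted with an MOE of about 3%, and why going from 1,000 to 4,000 respondents buys only another point and a half.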
 
livingstone said:
Was there? As I recall, the exit poll overestimated FG by about 1.5% and underestimated SF by about 2%. Both very slightly outside the margin of error but only very slightly. FF and the other parties were pretty much spot on.
Those 2, and as far as I remember there were a number of others that were different. The margin of error for that poll was 1.3%. Does the margin of error not depend on the number of people polled?
 
Hmmmm, I know nothing, or certainly not enough, about US politics to make a judgement call on its polls... but Ireland...

I doubt anyone would disagree that the main media outlets, RTE, IT and Indo, never mind Newstalk (who supply the vast majority of radio news), have given Leo (and his media advisors, who have been plucked from the above media outlets) their support and been fairly consistent in promoting his FG as the best option for Government.

And from Jan 2018 to Dec 2019, they consistently told us that FG were in the low to mid 30s and never falling below 25%. For 2 solid years that's what they said the polls showed. Yet in Feb 2020, just weeks after telling us FG were at 25%, the actual electorate gave them just 19% (BTW, I have seen absolutely no media commentary on this FACT).

Opinion polling for the 2020 Irish general election - Wikipedia

en.wikipedia.org

Do you really think the electorate in Ireland is so volatile that a swing of over 6% can happen (after the polls being consistent for over 2 years) in under 4 weeks? Or do you think it's more likely the polls were wrong? And possibly deliberately wrong?

If you've ever been polled you'll know just how easy it is to frame a question in order to receive an answer that the polling company can frame as they wish. It's even easier, if you don't answer as required, for them to simply not contact you again.
It's equally easy to filter respondents beforehand.

Then look at the connections between the politicians and the polling companies, and the amount of work syphoned their way.

And finally, there is the idea that governments and political parties, who spend, even in Ireland, millions and millions on spin and promotion, and spend years targeting specific constituencies and researching focus groups, would somehow, with their connections, be too honourable to attempt to use polls to lead opinion rather than reflect it.

The only way anyone could believe that is to be utterly naive or ideologically partisan (and there is as much of that about in the centre as there is on the fringes).

I have very, very little faith in the polls, and I'm surprised, even if you suspend common sense, that following the last US election, Brexit and our own election anyone else does either.
 
