As the U.S. presidential election draws near, news outlets and social media platforms are awash with data from public opinion surveys. How do pollsters ascertain which contender holds the lead in key states or among specific demographics? Or what matters most to the approximately 264 million eligible voters across this vast nation? In short: How do pollsters do what they do?

At Emerson College Polling, we oversee a dynamic survey operation that, like many others, has continually adapted to keep pace with evolving trends and technologies in survey science. In the early days of survey research – roughly a century ago – data was gathered predominantly via mail and in-person interviews. That, of course, is no longer the norm.

In the survey industry's infancy, participating in a poll was a novelty, and response rates were high. Today, people are bombarded with survey requests through email, text messages, online pop-ups, and phone calls from unrecognized numbers. With fewer landlines, busy parents balancing work and family, and younger adults who seldom answer calls and prefer to text, engaging participants has become considerably more difficult. That behavioral shift is at the heart of the central challenge of contemporary survey research: reaching a diverse range of people.
In the simplest terms, polls and surveys consist of two elements – selecting whom to contact, and making contact in a way that is likely to produce a response. These components are frequently interconnected.

In the 1970s, following the widespread adoption of household telephones in the U.S., survey operators embraced a random-sampling technique known as random digit dialing: survey designers selected the area codes they aimed to reach, and live operators randomly dialed seven-digit numbers within those area codes.

By the 1990s, pollsters began to move away from random digit dialing, which was time-consuming and expensive because the random selection process often yielded phone numbers that were disconnected or unsuitable for opinion surveys, such as businesses or government agencies. Instead, pollsters turned to registration-based sampling, using public voter registration records to build the lists from which respondents were randomly selected. The details in these and other related public records, such as gender, age, and education level, allowed for a refinement of random sampling called stratified sampling, in which the main list was divided into subgroups based on traits such as political affiliation, voting frequency, gender, race or ethnicity, income, or education level. Survey administrators then selected randomly from these subgroups in proportion to the overall population. So, if 40% of the general population hold college degrees and 60% do not, a poll of 100 people would randomly choose 40 people from the list of those with a college degree and 60 from the list of those without.

Other advances in reaching respondents emerged late in the 20th century, such as interactive voice response, which did away with the need for live operators: automated systems played recorded questions and logged the verbal replies. In the 2000s, internet-based polling also gained traction, with participants completing online questionnaires.
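A minimal sketch of that proportional, stratified draw appears below, assuming a simplified voter frame with a single education field. The `frame` data, its field names, and the `draw_stratified` helper are illustrative inventions, not any pollster's actual tooling:

```python
import random

# Illustrative voter frame: in practice this would come from voter
# registration records; the fields here are hypothetical.
frame = [
    {"id": i, "college_degree": (i % 10 < 4)}  # ~40% hold degrees
    for i in range(10_000)
]

def draw_stratified(frame, key, proportions, n, seed=0):
    """Draw n respondents, allocating each stratum its population share."""
    rng = random.Random(seed)
    sample = []
    for stratum_value, share in proportions.items():
        stratum = [person for person in frame if person[key] == stratum_value]
        k = round(n * share)                    # this stratum's slice of the poll
        sample.extend(rng.sample(stratum, k))   # random draw *within* the stratum
    return sample

# 40% with a degree, 60% without -- as in the article's example,
# a 100-person poll draws 40 and 60 respectively.
poll = draw_stratified(frame, "college_degree", {True: 0.40, False: 0.60}, n=100)
print(sum(person["college_degree"] for person in poll), "degree holders of", len(poll))
```

Because selection within each stratum is still random, every degree holder has the same chance of being drawn as any other – a property that quota sampling, described next, gives up.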
Over the last two decades, the rise of mobile phones, text messaging, and online platforms has dramatically altered survey research. The traditional gold standard of polls conducted solely by live telephone operators has become practically obsolete. Now that phones display incoming callers, fewer people answer calls from unknown numbers, and fewer still are willing to discuss their personal views with a stranger.

Even once-standard random sampling has largely given way to quota sampling, a nonprobability method built around increasingly specific population proportions. So if 6% of a population are Black men with a certain level of education and a certain income level, then a survey will aim to have 6% of its respondents match those characteristics. In quota sampling, participants are not necessarily selected randomly; they may be chosen precisely because they possess specific demographic attributes. This method is less statistically robust and more prone to bias, though it may produce a representative sample with relative efficiency. In contrast, stratified sampling randomly selects participants within defined groups, minimizing sampling error and providing more precise estimates of population characteristics.

To help polling operations find potential respondents, political and marketing consulting companies have compiled voter information, including demographic details and contact information. At Emerson College Polling, we have access to a database of 273 million U.S. adults, with 123 million mobile numbers, 116 million email addresses, and almost 59 million landline numbers.

A newer method pollsters are using to reach respondents is called river sampling, an online method in which individuals encounter a survey during their regular internet browsing and social media use, often through an ad or pop-up. They complete a brief screening questionnaire and are then invited to join a survey opt-in panel whose members will be asked to participate in future surveys.
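Quota designs are often filled from exactly these kinds of opt-in and river streams: respondents arrive in whatever order the internet delivers them, and each is kept only if their demographic cell still has room. The sketch below illustrates that idea; the cell definitions and the `take_if_needed` helper are hypothetical, not a description of any specific pollster's screener:

```python
# Hypothetical quota design: remaining slots per demographic cell.
quotas = {
    ("Black", "male", "college"): 6,        # e.g. 6% of a 100-person target
    ("white", "female", "no_college"): 20,
    # ... one entry per cell in the design
}

def take_if_needed(respondent, quotas, sample):
    """Keep a respondent only while their quota cell is unfilled."""
    cell = (respondent["race"], respondent["gender"], respondent["education"])
    if quotas.get(cell, 0) > 0:
        quotas[cell] -= 1
        sample.append(respondent)
        return True
    return False  # cell is full, or not in the design: screen the person out

sample = []
incoming = [  # respondents in arrival order, e.g. from an ad or pop-up
    {"race": "Black", "gender": "male", "education": "college"},
    {"race": "white", "gender": "female", "education": "no_college"},
]
for person in incoming:
    take_if_needed(person, quotas, sample)
print(len(sample), "respondents accepted so far")
```

Note that nothing in this process is random – whoever shows up first fills the cell – which is why quota samples hit their demographic targets efficiently but are more vulnerable to bias than a stratified random draw.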
Our polling operation has employed various strategies to contact the more than 162,000 individuals who have completed our polls this year in the United States. Unlike traditional pollsters, Emerson College Polling does not rely on live-operator data collection, except in small-scale tests of new survey methods to assess and enhance the efficacy of various polling approaches. Instead, like most contemporary pollsters, we use a mix of methods, including text-to-web surveys, interactive voice response on landlines, email outreach, and opt-in panels. This approach allows us to reach a broader, more representative audience, which is essential for accurate polling in today's fragmented social and media environment. That varied population includes younger individuals who communicate through different channels than older generations do.

When contacting people in our stratified samples, we account for differences between the communication methods. For example, older people tend to answer landlines, while men and middle-aged individuals are more responsive to mobile text-to-web surveys. To reach underrepresented groups – such as adults ages 18 to 29 and Hispanic respondents – we draw on opt-in online panels that people have voluntarily joined, understanding that they may be surveyed.

We also use information about whom – and how many people – we sample to compute the margin of error, which gauges the precision of poll results. Larger samples tend to be more reflective of the overall population and therefore yield a smaller margin of error. For example, a poll of 400 respondents typically has a 4.9% margin of error, while increasing the sample size to 1,000 reduces it to about 3%, offering more precise insights.

The objective, as always, is to provide the public with an accurate representation of people's opinions on candidates and issues.
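Those figures follow from the standard margin-of-error formula for a simple random sample at 95% confidence, z * sqrt(p(1 - p) / n), evaluated at the conservative worst case p = 0.5. A quick sketch reproduces them; note that real polls typically also adjust for weighting and design effects, which this simple formula ignores:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a simple random sample of size n.

    p = 0.5 is the conservative, worst-case proportion; z = 1.96 is the
    critical value for a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (400, 1000):
    print(f"n={n}: ±{margin_of_error(n):.1%}")
# n=400: ±4.9%
# n=1000: ±3.1%  (commonly rounded to about 3%)
```

The square root in the denominator is why precision gains taper off: quadrupling a sample only halves its margin of error.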