observatoire des sondages

Criticism of polls and surveys in American social science

Tuesday 24 March 2015, by Howard S. Becker

Research using polls and surveys for primary data has been criticized for as long as those instruments have been used in the United States for commercial, political, academic and scientific purposes. Two crucial episodes of this criticism suggest the larger organizational and professional/political issues involved.

Academic survey research, and the related enterprise of commercial and political polling meant to predict election results and consumer behavior, have been closely connected. We can better understand today’s polls if we see them as part of a larger movement, designed to create a “scientific” social science, whose two connected but distinguishable branches collaborated in the shared effort to legitimate a style of research that came to be known, variously, as survey research or polling. One branch grew out of the interest of businesses, and the advertising agencies they supported, in finding out what their audiences and customers wanted so that they could make larger profits. The other grew out of the statistical tradition in sociology which, starting with perhaps Quetelet and continuing through Durkheim and then, in the United States, sociologists like Ogburn, wanted to prove that sociology and related disciplines studying contemporary society were “real sciences” like physics and chemistry, capable of producing demonstrably true generalizations and laws by using the rigorous methods of measurement and statistical and mathematical analysis of those sciences.

Blumer versus Stouffer: The American Soldier

An early stinging, comprehensive and profound attack on survey research and opinion polling in the United States came from one of sociology’s great critics, Herbert Blumer.

In December of 1947, the American Sociological Association held its annual meeting in New York City. Blumer gave a talk titled “Public Opinion and Public Opinion Polling”. Imagine the scene. Blumer, a large and imposing figure, a former professional player of American football, who spoke in an impressive oratorical style, was an important student of Robert E. Park and George Herbert Mead, and is often considered one of the founders of the Chicago School of sociology. He was and had been for many years professor of sociology at the University of Chicago. His typical approach to any topic he spoke about was to describe in general terms the ways most scholars approached the topic (whatever the topic was) and then to say that they were all wrong. After explaining in detail the failures of each approach, he would announce the “correct approach,” invariably a position that could be deduced from the writings of Mead, and that situated the specific topic in a more comprehensive view of society and social life.

When Blumer spoke, people listened. Any speech of his made news in the then small world of American sociology. On this occasion, he demolished, in his methodical, ponderous way, the theory and practice of the study of public opinion, as that had crystallized in the United States, and particularly denounced the methods and results of public opinion polling, asserting that the conceptions of the public and of public opinion embedded in this style of work were faulty and its results inevitably erroneous. Pollsters had no well-defined conception of public opinion, and simply identified it as the results of their interviews. He, on the other hand, identified “public opinion” as the collective understanding of a topic developed through discussion and debate in and between organized groups, not as the adding up of individual opinions, as polling methods assumed. If you accepted that understanding of the topic, he made clear, individual interviews, of the kind made for polls, told us nothing about public opinion.

When he finished his presentation, two discussants made academically conventional criticisms of his paper. Then from the floor, Samuel Stouffer, slightly younger than Blumer, a Ph.D. from Chicago, a professor at Harvard, and a well-known proponent of the methods and style of work, and of the theories underlying them, that Blumer had just so vigorously and devastatingly denounced, made a rebuttal. No one wrote down his exact words, but someone who was there told me that Stouffer shocked the assembled sociologists, not by disagreeing with Blumer, which everyone expected, but by unexpectedly denouncing him as “the gravedigger of American sociology.” Blumer’s critique had clearly stung and promised to interfere with something important Stouffer was engaged in. What caused such a distinguished Harvard professor to erupt like that? What entity was Blumer supposedly digging the grave for?

Stouffer had spent the WWII years running the largest survey research operation ever conducted to that time. Under the auspices of the U.S. Army, the research organization he created wrote questionnaires on all sorts of subjects of interest to the Army’s commanders, pretested them, collected completed forms from half a million soldiers, analyzed the results and presented them in written form to military commanders. They studied problems related to military morale, to the eventual demobilization of the soldiers, and many other topics. The operation had without doubt been a grand success, highly approved by the top general, George C. Marshall.

For Stouffer the importance of this work went far beyond Marshall’s good opinion. He aimed at something far more important, nothing less than the future of American sociology and social psychology. He wanted to transform these fields into what he, and many others, thought of as “real science,” which in his view meant measuring important variables and using advanced statistical methods to analyze them in order to test rigorously derived hypotheses in a definitive way. He thought the work he had done during the war would let him do that, demonstrating the feasibility of the methods and their efficacy in producing real science.

He had secured funding for a large team to produce the four volumes of The American Soldier, substantive and methodological essays based on the survey data collected for the Army, which used the material (“the data”) to explore genuinely sociological problems, both theoretical (the concept of “reference group” was one of the most prominent such results) and methodological (essays reporting such analytic inventions as the Guttman scale and Paul Lazarsfeld’s latent structure analysis).

Lazarsfeld and Merton edited a fifth volume, Continuities in Social Research: Studies in the Scope and Method of The American Soldier, not officially part of the project, intended to demonstrate the purely scientific uses one could make of this vast store of material so conclusively that this style of work would eventually dominate social science.

The loosely organized work group intended to make unarguably clear that this kind of research—well worked out theoretical premises tested and proved by elegant quantitative analyses of carefully measured data, in this case data for the most part on attitudes—would prove to skeptics in their own field but, more importantly, in the physical and biological sciences, that sociology was a “real science”. This style of work was already entrenched in the Harvard and Columbia departments of sociology and Stouffer, Lazarsfeld, Merton and their colleagues wanted to make it what Thomas Kuhn later called the “normal science” of the coming era of American (and therefore world) sociology. They wanted their five volumes to crush any opposition. And they wanted to show legislators and the natural scientists who were controlling the creation of the National Science Foundation that social science deserved its share of government research funding.

Later work, designed to further prove their contentions, came from those departments. From Harvard: Stouffer’s Communism, Conformity, and Civil Liberties, which dealt with the fear of communism among the American public and its leaders, and their concerns with the erosion of civil liberties. From Columbia: Merton’s Mass Persuasion, a study of a radio campaign to sell “war bonds”, and empirical studies of voting (by Berelson and Lazarsfeld) and of medical practice (Coleman, Katz and Menzel; The Student Physician; etc.).

Questionnaires, which collected data in a way amenable to measurement and numerical analysis, especially when used on a large scale in what came to be called survey research, promised to do that. Questioning large numbers of people with standardized instruments let you measure, and then control statistically, such variables as age, sex, class, religious affiliation and political preferences, which could not be controlled experimentally. Large scale surveys made it possible to evaluate sociologically meaningful propositions in a manner they considered demonstrably rigorous.

Surveys cost a lot. It was never easy for the ambitious sociologists who nursed this vision to do the large surveys that would accomplish that goal. WWII provided such an opportunity and the growing demand for reliable information on consumer behavior another. As soon as the U.S. entered the war in 1941, the Army organized and paid for an unparalleled series of large scale surveys of topics like military morale, designed to provide reliable answers to the practical questions the military organizations wanted answered in order to perform their military mission successfully. But the people who were doing the surveys, a team of academics led by Samuel Stouffer, saw in this collection of 123 large surveys of military personnel on a variety of topics an unparalleled collection of data they could use to develop and test scientific propositions.

They were joined in this by an up-and-coming sociological theorist, Robert K. Merton, who joined forces with the similarly influential methodologist Paul Lazarsfeld, both professors at Columbia University, where the latter had already implemented his idea of a research institute centered on the use of survey data to test sociological ideas.

Many thought, and still think, that they succeeded, and the proof can be seen in the pages of the major American sociological journals, where studies in this style provide the vast majority of articles.

But the victory was never complete. Blumer had laid down a strong counter-argument, which questioned the basic premises of the approach, and other schools of thought continued to contest the matter. Substantial numbers of sociologists never agreed that survey data or polling interviews had or would ever create science of the kind these prophets promised. And Stouffer, in 1947, before the campaign had really begun, and fearing that Blumer would convince too many people, uttered his frustrated cry. And then the whole enterprise lost substantial ground as a result of the 1948 presidential election in the United States.

The 1948 Election Fiasco

The first attempt at assessing the state of public opinion by means of what came to be called “polls” was the survey conducted by a popular magazine called Literary Digest, designed to predict the winner of the 1936 presidential election in the United States. This poll spectacularly misread the state of public opinion, announcing that Alf Landon, an obscure Republican from the state of Kansas, would win over the incumbent president, Franklin D. Roosevelt. Which, of course, did not happen. That led to criticism of the methods used. We might well say that criticism of polling method has proceeded simultaneously with the development of the method itself [1].

George Gallup’s new American Institute of Public Opinion, which had correctly predicted the same election, moved into the ecological niche created by the Literary Digest debacle. It did well until 1948 when, like the other major organizations (Crossley and Roper) in the still relatively new industry of polling, it predicted that the Republican candidate, Thomas Dewey, would defeat Harry Truman. Which also did not happen.

This event caused a serious reconsideration of the many problems of doing accurate polling and making predictions that withstood the test of reality. The credibility of the whole operation was being openly doubted. This failure of polling and survey methods affected both the commercial interests of the big polling organizations and the aspirations and continued existence of academic research organizations and individual scholars.

The predictive failure of the election polls had important consequences. In the time between 1936 and 1948, polling had become a large business, which made its profits by doing surveys designed to help commercial enterprises (manufacturers, advertisers, radio networks, Hollywood studios) guess what the buying public would respond to in a way that would make money for them. Election studies had become what they have remained [2], the one kind of survey study whose accuracy can be assessed by comparison with the events it is meant to predict. The accuracy of commercial surveys could never be demonstrated as effectively, because too many other variables affected the behavior they were supposed to measure: public response to an advertising campaign or a new product. With an election poll, you eventually knew that the poll was correct (or not) when and if the election results coincided with its predictions.

Unless, of course, they didn’t. As they did not in 1948. The spectacular failure of survey methods in that election threw the whole commercial enterprise into doubt and pollsters were anxious to rescue their businesses from this potentially fatal failure. The spectacular failure provoked an alarmed response from organized social science as well. The two groups cooperated in a quickly organized committee of inquiry, led by the Social Science Research Council (SSRC), which included representatives of both groups.

The SSRC mainly represented the interests of the group which played a major part in what followed: the group led by Stouffer, Merton and Lazarsfeld that Blumer had attacked, the group so anxious to prove that social science was a real science. They wanted to make that claim without being laughed at by the chemists and physicists. Psychology had tried to do that, as it still does, by imitating the experimental methods thought to be responsible for the success of the “hard sciences”. But everyone soon could see that the social sciences could not use experimental methods, for reasons both practical and ethical. So they were looking for techniques that would let them talk about the great subjects of their disciplines while approximating laboratory methods that controlled all the variables but the one whose effect you wanted to measure.

Survey research looked like the method that would replace experimentation as a demonstration of how “scientific” both commercial and academic social science research was or, at least, could be if done properly. But the polling failure of 1948 made that claim difficult to demonstrate.

So the SSRC organized a committee to rapidly produce a report that would find out what went wrong and make recommendations so that no such failure would ever happen again to threaten the claims these groups made to do real science. The committee included the leaders and other representatives of the major commercial organizations—Crossley, Roper and Gallup—and experts who had worked on The American Soldier.

The committee produced no surprises, because all the difficulties that had caused the failure had been well known in the trade for years. Quota sampling procedures (which the committee criticized) instructed interviewers to get a certain number of people from each of a number of categories but left them free to find those people any way they could. It was much easier to do than probability sampling, but made it impossible to use the mathematical reasoning that made generalizing from small probability samples possible. Similarly, well-known problems in questionnaire construction led to known sources of uncontrolled error. For instance, varying the order in which interviewers asked questions produced, then as now, substantial errors, and everyone in the business knew that. Furthermore, interviewer cheating (filling out the forms for people they had never talked to) produced further errors. The committee predictably discussed errors like these and recommended further research designed to solve these problems. Some research was done, but the recommended remedies cost more, and most researchers and pollsters considered them, and still consider them, “impractical”.

Responses to These Problems

Shortly after the SSRC committee report appeared, Oskar Morgenstern, the well-known economist who, among other things, was the co-inventor of game theory, published On the Accuracy of Economic Observations [3], a definitive compilation of known sources of error in economic data, many of them found in data used in survey research and polling. Many are obvious: the dangers of clerical errors of copying, misreading handwritten data forms, misprints and other kinds of editorial error in published sources. Morgenstern demonstrated gross errors in data economists used when they evaluated hypotheses by relying on very small differences in quantitative results, and suggested that instead of using standard tests of significance social scientists would do better to insist on differences of at least 10% before they took a finding seriously. A recent article concluded that nothing has really changed since Morgenstern wrote this book. The sources of known error still require the kind of skepticism he called for sixty years ago.

Critics continue to criticize but no one does anything about these problems. Years ago I met the director of a survey research center in Canada and, feeling mischievous, began to ask him how he dealt with the problem of order effects in survey interviews. Did he use, as critics recommended, two alternate versions of the interview form, which would allow him to evaluate the size of this error? No, he said, he didn’t do that. I persisted, and asked if he used alternate forms to evaluate the contribution of response sets, a demonstrated tendency that some people have to agree with any strongly worded statement, no matter what its content is, which could also be dealt with by using two alternate forms with statements worded strongly in both directions. He said he didn’t do that either. I mentioned a few other things and he finally said I could stop, he didn’t take any of these remedial actions. He said that I surely knew that every time you added another pair of variants to the forms, you doubled the number of forms required. (For the mathematically inclined, the principle involved is that each new pair of possibilities doubles the number of required forms, so the total grows as a power of 2. Two alternate forms become four (2² = 4) when you add a second pair of alternates, and eight (2³ = 8) when you add a third pair.) And, he reminded me, his center was in Canada, and so all questionnaires had to be in two languages to begin with, which added still another power of two. “It’s not practical”, he said. And he was right.
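The doubling the director described can be checked with a few lines of Python. This is only an illustrative sketch: the factor names and labels below (question order, statement wording, a third unnamed variant, language) are hypothetical stand-ins for the kinds of alternates mentioned in the conversation, not a description of any real survey center's forms.

```python
from itertools import product


def form_variants(factors):
    """List every questionnaire version implied by a set of independent choices.

    `factors` maps a (hypothetical) factor name to its alternative settings.
    Each returned dict describes one distinct form that would have to be printed.
    """
    names = list(factors)
    return [
        dict(zip(names, combo))
        for combo in product(*(factors[name] for name in names))
    ]


# Three pairs of alternates, as in the parenthetical above (labels are assumptions).
factors = {
    "question_order": ["A first", "B first"],
    "statement_wording": ["worded for", "worded against"],
    "third_variant": ["version 1", "version 2"],
}
forms = form_variants(factors)
print(len(forms))  # 2 * 2 * 2 = 8 forms

# A bilingual center needs every one of those forms in both languages.
bilingual = dict(factors, language=["English", "French"])
bilingual_forms = form_variants(bilingual)
print(len(bilingual_forms))  # 2 ** 4 = 16 forms
```

Each additional two-way choice doubles the length of the list, which is the arithmetic behind the director's “It’s not practical”.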

What do pollsters and survey researchers do about these criticisms, which have never stopped? I’m a little ashamed to say this but, though the criticisms persist, they simply ignore them as much as they can. In most cases, that means that they can ignore them completely. Because the consumers of their data and research want those results, which they use to do what they have to do: write articles, sell washing machines, run political campaigns. They will overlook even the grossest errors if it is at all possible. And they don’t want to pay what it would cost to get rid of them.

Kuhn has shown that the only time paradigm changes occur is when practitioners try to “solve” the problems and, in the course of doing that, begin to work in different ways, so that the unity of the field collapses and they can no longer work in a unified way to solve shared problems. Despite all the research done to ameliorate the problems, they do what working scientists usually do when confronted with difficulties in their paradigmatic ways of going about their work:

Some history: Academic sociology

In fact, the opposition between two styles of work—the one championed by Stouffer and others, and the other led by an assortment of people who had commitments and interests that led them to other ways of working, mainly quasi-ethnographic—was an old one. And Stouffer was right to worry. His victory was never complete.

From the beginning of sociology in the U.S., there was a tension between gathering information in a form suitable for statistical treatment and gathering it in some “richer” format that would allow for the acquisition and use of more detail. There were various versions of this fight, but it was often referred to as “statistics” vs. “case study”, and the idea of a case study had some sort of reference to a tradition of doing case studies of social welfare cases.

As sociology tried to be accepted as a serious scholarly discipline in the academy, it felt it had to get rid of the stigma of social work, the idea that the job of sociology was to help people live a better life, overcome the problems of urban living, etc. Instead, the representatives of sociology looked for ways to be more impartial, more attuned to gathering factual material that would report accurately and without bias on the state of the social world.

This search took various forms. On the one hand, the state in the many forms it takes in the U.S. (federal, state, city, etc.) collects information and makes it available in various forms, usually tables, rates, etc. (Census, health and crime statistics, educational statistics, etc.)

Because of the administrative origins of these concerns [4], these data are usually not collected with the testing of scientific hypotheses as a first concern. In the U.S., the Census tries to satisfy, first, the constitutional requirement to count people in order to apportion representation in the Congress; then a host of administrative planning needs; and, in large part, the demand of business enterprises for data to use in planning their location, output, etc. Government health organizations collect health data, banks and other financial institutions collect economic data, etc. They all do it for their own purposes, not for ours.

But neither the state nor any other organization collects information on many subjects people and enterprises would like to know about: preferences among commercial products, for instance, or the many topics sociologists are interested in that aren’t the object of state concern: marital happiness, racial attitudes, children’s ambitions or family histories.

So researchers have always had to find the money to pay for research. In the first instance, money to pay their salaries so that they can take time away from teaching and other university duties. The early years of American anthropology, for instance, contained many instances of “summer” research. As soon as the school year ended, the professors went to their field sites and learned all about what Native Americans did in the summer; the rest of the year was often unknown. Similarly, they had to find money to pay students to gather data or do small projects.

Those interested in quantitative research had to find ways to gather the large amounts of standardized data that such research typically demanded. Occasionally a large foundation would pay for such a program of data gathering and analysis, but this was always on a project-by-project basis. It’s an uneconomical way to do things.

A better way, as Paul Lazarsfeld saw and recommended (but he wasn’t the only one), was to create a permanent data gathering/analysis organization. Such an organization could have a permanent staff of interviewers, analysts and supervisors and interpreters of the results. But such an organization needed continuing financial input. Someone had to pay the bills. Sometimes a permanent client would commission ongoing surveys on a topic of continuing interest, as with the Federal Government’s financing of surveys of consumer confidence at the Survey Research Center at the University of Michigan in Ann Arbor. Other such organizations were the National Opinion Research Center at the University of Chicago and the Bureau of Applied Social Research at Columbia University in New York, which for a considerable time got continuing support from the Columbia Broadcasting System (whose CEO was a social scientist), as did a similar, smaller scale, non-university based organization made up of University of Chicago anthropologists and psychologists (Social Research Inc.).

A major component of this change had to do with the increasing financial support that the federal government was giving to scientific research during the pre- and post-war periods, which culminated in the establishment of the National Science Foundation. At first, the NSF’s brief did not include social science at all. When social science was finally incorporated, it was with hesitation and apprehension (fears that the Congress would confuse it with socialism, for example, and worries that the hard sciences would accuse social scientists of not being scientific at all), and the bureaucrats in charge of social science at NSF have worried (no doubt with reason) about supporting anything that didn’t look as much as possible like a somewhat caricatured version of how the natural sciences actually operate.

University based research organizations were able, at times, to engage in “basic research”, designed to test scientific theories and to develop adequate research methods. They gave scientific legitimacy to the methods they used by trying to improve them through the trial and error methods typical of normal science.

On occasion, the federal government would provide large scale continuing research funds for such an enterprise as the General Social Survey, which (with 28 years of continuing, substantial funding from the NSF) has investigated all kinds of theoretical and methodological questions through long-running comparative studies that only a university based, government funded enterprise could do.

Stouffer and his colleagues hoped to settle these questions and get rid of these doubts about whether social science was or was not science. The American Soldier was meant to be the definitive weapon in this war to establish sociology and social psychology as “real science” and thus assure adequate financial support, more or less independent of commercial pressures.

Blumer’s attack, from within the profession, did not in the end prevent much of what Stouffer, Lazarsfeld and Merton wanted to see happen. But it did signal that their triumph would be partial, and that they would never control the field in the total way they had hoped for and expected.

There were, of course, other sources of criticism, of a more technical kind.

Some history: political and commercial polling

Political and commercial polling began in the ’20s and ’30s and, as in France, had its origin in providing information to profit making enterprises. The basis of this symbiosis, as Garrigou explains, is that election polls provide the only “reality check” on the results of inquiries using the method. Because the polls measure what is “supposed to happen” when the election does occur, the election tells you whether the prediction was accurate. This is never possible for commercial polling, which tries to find out what people will buy, what they want to buy, what they don’t like about what they don’t buy, etc. Too many other things intervene for answers to questionnaires to be other than very bad indices of what people will actually buy or the reasons for their doing what they do.

So the election polls provide the warrant of accuracy for the unmeasurable accuracy of the commercial polls. But the commercial polls are where the money is. Election polls, paid for by news sources (newspapers, radio and TV, etc.), aren’t profitable, and in fact are often done at a loss. But they convince advertisers and merchandisers that the information they get about their more mundane business concerns is dependable. The commercial polls also help pay for the permanent staff that provides continuity for the research activities, commercial and political.

In the U.S. the field at first consisted of well known polling organizations like Gallup, Roper, and Crossley. Later many competitors entered the field, looking for profitable niches the big guys had overlooked or which were not profitable for them. More locally oriented polling organizations, like the Field Poll in California, found such geographical niches, the size of the country allowing this kind of development.

Large commercial organizations worked on problems of method only as much as they needed to in order to guarantee legitimacy to their clients, who were skeptical advertising men and marketers.

Over a relatively short time, a very effective symbiosis grew up between the commercial research organizations and the university based ones. This was probably forged at the time of the crisis caused by the 1948 U.S. election polling disaster, when all three major polls predicted incorrectly that Dewey would defeat Harry Truman for president. This created a terrible problem for the commercial pollsters, whose chief warrant to the purchasers of their services had suddenly disappeared. Did their failure to predict the election correctly mean that they couldn’t be trusted at all?

Howard S. Becker
Sociologist

[1] Peverill Squire, “Why the 1936 Literary Digest Poll Failed”, Public Opinion Quarterly, 52, 1988, pp. 125-133.

[2] Alain Garrigou, L’ivresse des sondages, Paris, La Découverte, 2006.

[3] Oskar Morgenstern, On the Accuracy of Economic Observations, Princeton, Princeton University Press, 1950.

[4] Cf. Alain Desrosières, La politique des grands nombres. Histoire de la raison statistique, Paris, La Découverte, 1993.