Yesterday’s Washington Post featured two articles arising from a survey of racism.  “The Color of Disaster Assistance”, by Richard Morin, led off with these provocative statements:

Americans are more willing to provide extended government assistance to white victims of Hurricane Katrina than to African Americans and other minorities — particularly blacks with darker skin.

Overall, the "penalty" for being black and a Katrina victim amounted to about $1,000 . . .

The claim that Americans are, to be blunt, racists derives from an online survey of 2300 individuals that tested respondents for “subconscious racial bias”.

This is a disturbing allegation and, since I’m a professional statistician, I thought I’d take a deeper look at this survey and the findings based thereon.

The thing to do after reading the news stories would be to peruse methodological and other statistical documentation regarding the survey.  Unfortunately, none is provided.  The companion article, “Natural Disasters in Black and White”, by Shanto Iyengar of Stanford University and Richard Morin, provides a selective summary of the survey results along with a bare-bones description of the procedures used to gather responses, but that hardly qualifies as adequate statistical and methodological documentation.

However, enough information is found in the two news articles to conclude that the survey is fatally flawed, the results virtually worthless, and the claim of widespread subconscious racism among Americans unsupported.

The key idea to keep in mind as I walk through the reasons for my assessment is that, in order for a survey to provide a sound foundation for making valid inferential statements about a larger population, the survey respondents must constitute a sample that is representative of that larger population (known as the “universe”).  The statistically accepted method for assembling such a representative sample is random selection.  That is, the individuals to be included in the sample must be selected randomly from the universe of interest.

For the survey at hand, the universe is all Americans.  In order to permit formulation of valid inferences about Americans in general, the sample must be drawn randomly from the universe of Americans.  To be drawn randomly means, in the simplest case, that every individual in the universe has an equal chance of being included in the sample.  If a random sample of adequate size is drawn, then it will approximately match the demographic characteristics of the universe: the age-sex-ethnic/racial composition of the sample will be very close to that of the American people.
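
To make the point about random selection concrete, here is a minimal sketch in Python.  Everything in it is an assumption made for illustration: the universe of 100,000 people, the 30% share holding some trait, and the function name are all hypothetical and have nothing to do with the Post survey's actual data.  It simply shows that a simple random sample of 2,300 tends to land close to the universe's true composition.

```python
import random

def simple_random_sample_share(population, sample_size, seed=0):
    """Draw a simple random sample (without replacement) and return
    the share of sampled members who have the trait (coded 1)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical universe: 100,000 people, 30% of whom have the trait.
population = [1] * 30_000 + [0] * 70_000
population_share = sum(population) / len(population)

# Same sample size as the survey discussed in this post.
sample_share = simple_random_sample_share(population, 2_300)

print(f"universe share: {population_share:.3f}")
print(f"sample share:   {sample_share:.3f}")
```

With a sample of this size, the sample share should typically fall within a couple of percentage points of the universe share, which is what makes inference from a random sample defensible.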

With that as background, I move on to a critical assessment of the Post survey.

Were survey respondents drawn at random from the universe of all Americans?  No, they were not, for this was an “online” survey.  Respondents had to choose to navigate to a website and answer the questions.  This generates what is known in statistical parlance as a “self-selected sample”.  Respondents were not chosen by a random selection process but rather chose themselves for inclusion in the survey.

In self-selected samples, survey respondents represent no one but themselves.  The responses speak only for those who made them.  That fact alone fatally flaws the survey and makes it worthless as a basis on which to make inferential statements about Americans.  Yet both news articles repeatedly do exactly that.
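
The damage done by self-selection can be sketched with a small simulation.  Again, every number here is an assumption chosen purely for illustration (a hypothetical universe in which 30% hold some trait, and in which trait-holders are five times as likely to volunteer); none of it is drawn from the Post survey.

```python
import random

def self_selected_share(population, response_prob, seed=0):
    """Let each person decide whether to respond, with a probability
    that depends on their trait; return the trait share among respondents."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    respondents = [trait for trait in population
                   if rng.random() < response_prob[trait]]
    return sum(respondents) / len(respondents)

# Hypothetical universe: 100,000 people, 30% of whom have the trait (coded 1).
population = [1] * 30_000 + [0] * 70_000

# Assumed response rates: trait-holders volunteer five times as often.
response_prob = {1: 0.05, 0: 0.01}

respondent_share = self_selected_share(population, response_prob)
print(f"universe share:   {sum(population) / len(population):.3f}")
print(f"respondent share: {respondent_share:.3f}")
```

Under these assumed response rates, trait-holders make up a majority of respondents even though they are a minority of the universe, and collecting more volunteers does nothing to correct the distortion.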

Further confirmation that survey respondents do not represent the American population is found in the more detailed of the two articles, which mentions some of the demographic characteristics of the survey sample.

[T]he sample was skewed heavily in the direction of Democrats and liberals — only 12 percent of the participants identified as Republican. Eighty-six percent were critical of President Bush's handling of Katrina. The sample was also highly educated — 84% had completed at least a bachelor's degree.
. . .
Our analysis includes participants of all ethnicities although the vast majority (86 percent) were white.

To my mind, this constitutes an implicit admission that the survey does not represent the American people as a whole, and therefore the provocative claims cited at the top of this post are without proper foundation.

The allegation that Americans harbour “subconscious racial bias” arises from the fact that, on average, the 2300 survey respondents were willing to award a hypothetical white Hurricane Katrina victim government assistance of $1501 per month for 12.38 months.  However, when the hypothetical victim was African-American, the average award was $1498 per month for 11.64 months.  Thus, survey respondents were willing to award a hypothetical white Katrina victim a total of $18,582 and a hypothetical black Katrina victim $17,437.  This is the source of the claim cited at the top, “[T]he ‘penalty’ for being black and a Katrina victim amounted to about $1,000”.
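
The arithmetic behind those totals can be checked directly.  The sketch below uses only the four averages quoted above; the variable names are mine, and the split of the gap into a "monthly amount" part and a "duration" part is one simple way of decomposing it, not something taken from the articles.

```python
# Averages as reported in the articles.
white_monthly, white_months = 1501, 12.38
black_monthly, black_months = 1498, 11.64

white_total = white_monthly * white_months
black_total = black_monthly * black_months
gap = white_total - black_total

# One simple decomposition of the gap (my own, for illustration):
# the part due to the $3 difference in monthly amount, and
# the part due to the 0.74-month difference in duration.
amount_part = (white_monthly - black_monthly) * white_months
duration_part = (white_months - black_months) * black_monthly

print(round(white_total), round(black_total), round(gap))
# prints: 18582 17437 1146
print(round(amount_part), round(duration_part))
# prints: 37 1109
```

On these figures the gap works out to roughly $1,146, and nearly all of it comes from the shorter period of assistance rather than from the monthly amount, which differs by only $3.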

In discussing this result, Dr Iyengar and Mr Morin say:

Those who saw the African-American version of Terry Miller (Medina) [the name of the hypothetical victim] awarded a significantly reduced period of assistance. . . .  Conversely, participants awarded a significantly longer period of assistance after reading about the same Terry Miller, but who now appeared to be white.

From the perspective of technical statistics, that passage uses the word “significant” in a misleading—indeed, nonsensical—manner.

To say that the difference between two averages calculated from sample responses is significant is to say that statistical methods allow us to infer, with a very high (typically, 95%) level of confidence, that the corresponding parameters in the universe represented by the sample are different.  For example, statistical analysts might say that, based on some public opinion poll, they are 95% confident that support for the Democratic Party among the American electorate is higher than support for the Republican Party.  Such a statement would have to be based on a representative sample randomly selected from the American electorate.  But a representative sample randomly selected is exactly what we do not have here.  Therefore, significance tests cannot validly be conducted and statements about significant differences are completely out of place.  To claim otherwise, as the authors appear to have done here, is specious.

Even on its own terms, moreover, the significance test is problematic.  As already noted, the sample was 86% white, but the average dollar amounts of support were apparently calculated using the entire sample—whites and non-whites.  Given that non-whites are implicated in the lower average amount awarded to African-Americans and other racial/ethnic minorities, it seems to me nonsensical to conclude on that basis that “Americans” exhibit subconscious racial bias.

The proper procedure, of course, would be to subdivide the sample into the four racial/ethnic groups named (whites, blacks, Hispanics, and Asians), calculate averages for each of the four sub-groups, and compare those results.  (But, of course, that’s only worth doing if the sample is representative of Americans in each of the four racial/ethnic groups, which it is not.)

As if that weren’t bad enough, the researchers also experimented with varying the skin tones of the hypothetical Katrina victims.  Photos of individuals from each of the four racial/ethnic groups were altered so that faces with lighter and darker complexions were presented to survey respondents.  Some variation in amounts awarded was found between light- and dark-skinned faces within each of the four groups.  On that basis, the following claim is made: “This divergence in the effects of skin color for whites and non-whites was statistically significant.”  I note again the statistically meaningless use of “significant”, aggravated here by adding the inapplicable adverb “statistically”.

The researchers also tested the effects on dollar amounts awarded of different types of news coverage.  That was somewhat less controversial (to me anyway), but it too is compromised by the non-representative nature of the sample.

In conclusion, Dr Iyengar and Mr Morin nevertheless insist on trying to draw generally applicable inferences from this hodgepodge.

The effects of the racial identity of individual hurricane victims on the prescribed level of government assistance for all victims are suggestive of what psychologists call the "automaticity" of stereotyping. People cannot help stereotyping on the basis of ethnicity despite their best efforts to act unbiased and egalitarian. As we noted at the outset, this particular sample of participants consisted of highly educated individuals who located themselves toward the liberal end of the political spectrum. Many of them live in and around the nation's capital, one of the more racially diverse and cosmopolitan areas of America. We suspect that this group would score at or very near the top of most measures of support for civil rights and racial equality. Yet their responses to Katrina were influenced by the mere inclusion of racial cues in news media coverage. The fact that this group awarded lower levels of hurricane assistance after reading about looting or after encountering an African-American family displaced by the hurricane is testimony to the persistent and primordial power of racial imagery in American life.

So, they know that their sample does not represent the American people.  (Causing me to ask, “Has this whole exercise been a charade?”)  But that’s OK because they also know that the sample is heavily weighted toward the most “liberal” segment of the population.  The folks in “this group” are said to be strongly committed to “civil rights” and “racial equality”.  The writers assume that, if “this group” shows evidence of subconscious racial bias, obvious bigotry would have been found if they’d bothered to expend the time and resources necessary to conduct a proper survey of all Americans in the first place.  No prejudice there at all.

Shanto Iyengar is Harry & Norman Chandler Professor of Communication, Professor of Political Science, and Director of the Political Communication Lab, Stanford University.  I searched his website and that of the Political Communication Lab (PCL) for documentation on the survey and unearthed none.  I did find it interesting, however, that the PCL described the news article co-authored with Richard Morin as follows:

Connections between different forms of news coverage and participants' willingness to support government assistance to hurricane victims explored in Washington Post-PCL study.

There is no mention of the alleged subconscious racism that the two Post articles make such heavy weather of.

Self-selected samples have been discussed on this blog several times because, like some creature from a B-movie horror flick, they keep coming back and being taken seriously.  In fact, in one previous instance, the Washington Post was taken in by a similarly worthless survey based on a self-selected sample.  See other blog posts here and here.

For a critique of a survey analysis that, like this one, made condescending assumptions about those excluded from the sample frame because the sponsor didn’t pony up enough cash to do a proper job, click here.

via OpinionJournal - Best of the Web.