Earlier this week, British medical journal The Lancet published a study estimating that, since the US-led invasion in March 2003, almost 655,000 Iraqis have died who would not have died had the invasion not occurred. That estimate is far above previous estimates of post-invasion Iraqi deaths, which generally range between 40,000 and 120,000. Immediately, the study received widespread attention and generated a great deal of controversy in the media, in the halls of government, and around the blogosphere.
The article, by Gilbert Burnham, Riyadh Lafta, Shannon Doocy, and Les Roberts, is entitled “Mortality after the 2003 invasion of Iraq: a cross-sectional cluster sample survey”. Drs Burnham, Doocy, and Roberts are affiliated with the Johns Hopkins Bloomberg School of Public Health, Baltimore, and Dr Lafta with the Mustansiriya University, Baghdad. The full text is available here in html, and here as a pdf document. (All page references to the study in this post refer to the pdf version.)
I put on my professional statistician's hat and had a good long look at the study. In my opinion, it is statistically unsound and unreliable. The study violates the basic principle of good statistical practice by relying on a non-random sample survey. Also, the article's description of survey operations raises reliability, and perhaps even credibility, questions.
The study is based on a sample survey conducted between May and July of this year utilising a cluster sample methodology. Cluster sampling is a multi-stage procedure to select sample respondents. In the first stage, clusters, or small areas, of the region (in this case, Iraq) to be surveyed are selected. Within the clusters, neighbourhoods are selected, and then main streets; finally, particular residences are chosen and surveyed. (More details are given below.)
Forty-seven clusters were selected in proportion to the population of 16 of the country's 18 Governorates. (Originally, 50 clusters were to be surveyed representing all Governorates, but operational problems necessitated omission of three.) Within each of the clusters, administrative units and main streets were chosen at random in proportion to population; then particular residential streets were chosen at random where households were surveyed.
[S]election of survey sites was by random numbers applied to streets or blocks . . . [p. 2]
The plan was to interview 40 households per cluster but, owing to the vagaries of field operations under potentially dangerous conditions, fewer than 40 households were surveyed in some clusters. In all, 1849 households were surveyed, with an average of 6.9 persons per household, comprising a total of 12,801 individuals.
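To make the first sampling stage concrete, here is a minimal sketch of allocating clusters to regions in proportion to population. It is an illustration only, not the study's actual procedure: the region names and populations are hypothetical, and the simple weighted draw with replacement is my own assumption about how such an allocation might be done.

```python
import random

# Hypothetical regions and populations, for illustration only;
# these are NOT the governorate figures used in the study.
HYPOTHETICAL_POPULATIONS = {
    "Region A": 6_500_000,
    "Region B": 2_500_000,
    "Region C": 1_000_000,
}


def allocate_clusters(populations, n_clusters, seed=0):
    """Assign clusters to regions with probability proportional to population.

    Each cluster is an independent weighted draw, so a region holding 65%
    of the population receives about 65% of the clusters on average.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    names = list(populations)
    weights = [populations[name] for name in names]
    draws = rng.choices(names, weights=weights, k=n_clusters)
    return {name: draws.count(name) for name in names}


allocation = allocate_clusters(HYPOTHETICAL_POPULATIONS, n_clusters=50)
print(allocation)
```

The point of the sketch is that every region, large or small, has a known non-zero chance of receiving a cluster; nothing is excluded in advance.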
Here arises a problem with the purported randomness of the cluster selection. According to the methodology as just outlined, all 47 clusters were located in urban areas: rural areas do not have “streets or blocks” as such, nor do they have residential streets with 40 adjacent households. On the study’s own description, then, not one cluster was selected in a rural area.
According to the UN's 2004 Iraq Living Conditions Survey (ILCS), however, 7,132,000 of Iraq's total population of 27,132,000 live in rural areas. (See Table 1.6 on page 22 [numbered 21] of this pdf document.) Some 26% of Iraq's population live in rural areas, but not one of the 47 clusters was located in a rural area. If a true random selection were made, with each cluster independently having a 74% chance of falling in an urban area, the probability that all 47 clusters would be chosen from urban areas is 74% raised to the 47th power, which is less than one in a million. It would appear that an a priori decision was made to exclude rural areas from consideration as cluster sites. In that case, the selection of sample respondents was not random. There are, I would think, good reasons for believing that armed conflict in urban areas is likely to kill more people than armed conflict in rural areas, other things being equal. It is therefore probable that the Lancet survey, because it includes only urban residents, is biased toward producing an overestimate of deaths.
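For readers who want to check the arithmetic, here is a short sketch of the calculation. It assumes, as the argument above does, that each cluster is an independent draw whose chance of being urban equals the urban share of the population; the variable names are mine.

```python
# ILCS population figures quoted above
rural_population = 7_132_000
total_population = 27_132_000
n_clusters = 47

# Urban share of the population (about 74%)
urban_share = 1 - rural_population / total_population

# Probability that all 47 independent draws land in urban areas
p_all_urban_exact = urban_share ** n_clusters

# The same calculation using the rounded 74% figure
p_all_urban_rounded = 0.74 ** n_clusters

print(f"urban share: {urban_share:.1%}")
print(f"P(all urban), unrounded share: {p_all_urban_exact:.2e}")
print(f"P(all urban), rounded 74%:     {p_all_urban_rounded:.2e}")
```

Whether the rounded or the unrounded share is used, the result is below one in a million.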
Serious questions are also raised by the description of field operations, according to which the survey went more smoothly than any survey I’ve ever heard of.
There were two survey teams, each consisting of two female and two male interviewers, and one supervising field manager. The survey was in the field between 20 May and 10 July 2006. Survey respondents were chosen according to the procedure outlined above. Once a particular residential street was selected within an administrative unit within a cluster, a start household on the street was chosen at random. Beginning with that household, the interview team proceeded to survey adjacent households until forty were done. Here’s an outline of the survey content.
The survey purpose was explained to the head of household or spouse, and oral consent was obtained. Participants were assured that no unique identifiers would be gathered. No incentives were provided. The survey listed current household members by sex, and asked who had lived in this household on January 1, 2002. The interviewers then asked about births, deaths, and in-migration and out-migration, and confirmed that the reported inflow and exit of residents explained the differences in composition between the start and end of the recall period. Separation of combatant from non-combatant deaths during interviews was not attempted, since such information would probably be concealed by household informants, and to ask about this could put interviewers at risk. Deaths were recorded only if the decedent had lived in the household continuously for 3 months before the event. Additional probing was done to establish the cause and circumstances of deaths to the extent feasible, taking into account family sensitivities. At the conclusion of household interviews where deaths were reported, surveyors requested to see a copy of any death certificate and its presence was recorded. Where differences between the household account and the cause mentioned on the certificate existed, further discussions were sometimes needed to establish the primary cause of death. [p. 2]
Now check this summary of field operations:
In 16 (0·9%) dwellings, residents were absent; 15 (0·8%) households refused to participate. [p. 4]
The interview team went to 1849 households in urban areas of Iraq and encountered only 15 refusals and only 16 residences where neither the head of the household nor a spouse was in. Don’t forget that they only went to each household once: there was no follow-up whatever. If I ran a door-to-door survey with a response rate of 98.3% on the first go-round, I’d think I’d died and gone to statisticians’ heaven. That is nothing short of miraculous. That response rate implies that family heads in urban Iraq are virtually always at home.
Don’t heads of households and their spouses in urban Iraq have jobs? Don't they go out to meet friends? Do they never visit relatives in other neighbourhoods or towns? Do they not engage in any activities outside their homes? Are they never in the middle of a family meal, unwilling to be interrupted by unknown visitors asking intrusive personal questions? Never out shopping for groceries or passing the time of day at a local coffee shop or dropping off the family car at the mechanic’s? Do they just stay around the house all day every day? In short, do those folks living in urban Iraq have any semblance of normal lives?
I realise that armed conflict would impel most people to huddle in their homes behind locked doors (in which case they would be unlikely to open the door to strangers), but that possibility doesn’t enter into it because the locations selected for interview were altered if they appeared unsafe.
Decisions on sampling sites were made by the field manager. The interview team were given the responsibility and authority to change to an alternate location if they perceived the level of insecurity or risk to be unacceptable. [p. 2]
Admittedly, I have no personal experience of daily life in Iraq. Nevertheless, the 98.3% initial response rate is foreign, not just to my experience, but to any real-world survey situation imaginable.
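The response-rate figures can be reproduced with a few lines of arithmetic. This is a sketch under a stated assumption: the first calculation treats 1849 as the number of households approached, which is how I have read the article; the second treats 1849 as completed interviews out of the 47 × 40 = 1880 planned, since 1880 − 1849 = 31 happens to equal the 16 absences plus 15 refusals. The variable names are mine.

```python
households = 1849   # figure reported in the article
absent = 16         # dwellings where residents were absent
refused = 15        # households that refused to participate

# Reading 1: 1849 households approached, 31 of them not interviewed
completed = households - absent - refused
response_rate = completed / households

# Reading 2 (alternative assumption): 47 clusters x 40 households approached,
# of which 1849 were interviewed
planned = 47 * 40
alt_response_rate = households / planned

print(f"absent:  {absent / households:.1%}")
print(f"refused: {refused / households:.1%}")
print(f"response rate (reading 1): {response_rate:.1%}")
print(f"response rate (reading 2): {alt_response_rate:.1%}")
```

On either reading the first-visit response rate is above 98%.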
Here's another strange remark about this survey's field operations:
One team could typically complete a cluster of 40 households in 1 day. [p. 4]
According to the summary of the survey content, quoted above, there’s a lot of ground to cover in each interview. Locate the head of household or spouse (fortunately, 99.1% of ‘em were at home when the interviewers showed up), and obtain oral consent. List by sex everyone living there now and everyone who lived there on a particular date over four years ago. Find out what happened to each of them and when, and write it all down. Focus on the ones who had died: find out the cause and circumstances of death; then ask to see the death certificate. If they have one (as 92% did), have them dig it out so the interviewer can take a good look at it. If there’s a discrepancy between the official cause of death and the one reported by the interviewee, hash that out. (The more I think about all that, the more unlikely that 0.8% refusal rate seems.)
Suppose each survey team works 10-hour days. Even that’s pushing it, because survey operations must be conducted with a view to finding respondents at home and willing to talk. (But apparently that's not a problem in urban Iraq.) Forty households in ten hours is an average of four interviews per hour, i.e., one every fifteen minutes. Granted, some interviews would be short: a husband and wife who have lived alone for the past five years would take only a few minutes. Since the average household has over six members, however, interviews are much more likely to be lengthy. Also, the interviewers need meal and other breaks. The assertion that 40 households could be interviewed in one day strains credulity.
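The time budget is easy to put into numbers. The sketch below assumes, as I have above, that a team handles one household at a time; the function name, the one-hour break allowance, and the 8-hour alternative are hypothetical values chosen for illustration, not figures from the article.

```python
def minutes_per_interview(households, hours_worked, break_hours=0.0):
    """Average minutes available per household, travel between doors included.

    Assumes the team interviews one household at a time.
    """
    working_minutes = (hours_worked - break_hours) * 60
    return working_minutes / households


# 40 households in a 10-hour day with no breaks at all
no_breaks = minutes_per_interview(40, 10)

# Same day with a hypothetical one hour of meal and other breaks
with_breaks = minutes_per_interview(40, 10, break_hours=1.0)

# A hypothetical shorter 8-hour working day
shorter_day = minutes_per_interview(40, 8)

print(no_breaks, with_breaks, shorter_day)
```

Any allowance for breaks, or for walking from one door to the next, pushes the average below fifteen minutes per household.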
Another discrepancy in the article’s description of operations raises the disturbing possibility that the survey could have been tainted by surveyor bias. Here’s the methodological description of the selection of respondent households.
The third stage consisted of random selection of a main street within the administrative unit from a list of all main streets. A residential street was then randomly selected from a list of residential streets crossing the main street. On the residential street, houses were numbered and a start household was randomly selected. From this start household, the team proceeded to the adjacent residence until 40 households were surveyed. For this study, a household was defined as a unit that ate together, and had a separate entrance from the street or a separate apartment entrance. [p. 2]
An administrative unit within the cluster was chosen at random, a main street within the administrative unit was chosen at random, a residential street crossing the main street was chosen at random, and a start household on the residential street was chosen at random. The interview team has no discretion whatever in the selection of survey respondents, with one exception (as already cited above):
The interview team were given the responsibility and authority to change to an alternate location if they perceived the level of insecurity or risk to be unacceptable. [p. 2]
The article doesn’t say how often the interview team exercised its discretion to change to an alternate location. To me, that is a serious omission, unless we are to understand that this never, or rarely, happened. In any case, no instances are reported of interviewers coming under fire or other threat, so that would appear to have been a very unusual circumstance.
Why then does this statement appear in the article?
Although interviewers used a robust process for identifying clusters, the potential exists for interviewers to be drawn to especially affected houses through conscious or unconscious processes. Although evidence of this bias does not exist, its potential cannot be dismissed. [p. 7, footnote omitted]
How could interviewers be “drawn” to particular houses if the selection of households was driven by a completely random process, except when interviewers felt insecure or otherwise at risk? The quoted statement doesn’t make sense in the context of what is supposed to be random choice of particular streets and households. It only raises further serious doubts about the sample selection process.
There are many other problems with the Lancet study that could be discussed. What I’ve presented here, however, is more than sufficient to demonstrate that the survey behind the estimate of “excess” deaths is statistically unsound, because it is biased by the non-random selection of interview respondents. Moreover, the article’s description of survey field operations is, in the absence of further supporting documentation, highly problematic.
In my judgment, the estimate of 655,000 deaths lacks solid foundation and therefore should not be relied upon.
UPDATE (18 Oct.): Follow-up here.
UPDATE (22 Oct.): Further critique: "Main street bias" in Lancet study









