[ Many right-wingers continue to dismiss reports about the Johns Hopkins University study published in the British medical journal The Lancet, which concluded that, at the time of the study, there had probably been about 100,000 excess deaths since the Anglo-American invasion. But my informal survey suggests that very few people have heard the telling interview about the study’s methodology which aired on This American Life (“What’s in a Number?,” 28 October 2005, episode 300) with one of the researchers (to get to the most relevant point in the interview, move ahead 12 minutes into the show). The article below, forwarded by Eva Dadlez, provides a further defense of the methodology, and suggests the 100,000 figure may well be low. –BL ]
by ANDREW COCKBURN
President Bush’s off-hand summation last month of the number of Iraqis who have so far died as a result of our invasion and occupation as “30,000, more or less” was quite certainly an under-estimate. The true number is probably hitting around 180,000 by now, with a possibility, as we shall see, that it has reached as high as half a million.
But even Bush’s number was too much for his handlers to allow. Almost as soon as he finished speaking, they hastened to downplay the presidential figure as “unofficial”, plucked by the commander in chief from “public estimates”. Such calculations have been discouraged ever since the oafish General Tommy Franks infamously announced at the time of the invasion: “We don’t do body counts”. In December 2004, an effort by the Iraqi Ministry of Health to quantify ongoing mortality on the basis of emergency room admissions was halted by direct order of the occupying power.
In fact, the President may have been subconsciously quoting figures published by iraqbodycount.org, a British group that diligently tabulates published press reports of combat-related killings in Iraq. Due to IBC’s policy of posting minimum and maximum figures, currently standing at 27,787 and 31,317, their numbers carry a misleading air of scientific precision. As the group itself readily concedes, the estimate must be incomplete, since it omits unreported deaths.
There is, however, another and more reliable method for estimating figures such as these: nationwide random sampling. No one doubts that, if the sample is truly random, and the consequent data correctly calculated, the sampled results reflect the national figures within the stated accuracy. That, after all, is how market researchers assess public opinion on everything from politicians to breakfast cereals. Epidemiologists use it to chart the impact of epidemics. In 2000 an epidemiological team led by Les Roberts of Johns Hopkins School of Public Health used random sampling to calculate the death toll from combat and consequent disease and starvation in the ongoing Congolese civil war at 1.7 million. This figure prompted shocked headlines and immediate action by the UN Security Council. No one questioned the methodology.
In September 2004, Roberts led a similar team that researched death rates, using the same techniques, in Iraq before and after the 2003 invasion. Making “conservative assumptions” they concluded that “about 100,000 excess deaths” (in fact 98,000) among men, women, and children had occurred in just under eighteen months. Violent deaths alone had soared twentyfold. But, as in most wars, the bulk of the carnage was due to the indirect effects of the invasion, notably the breakdown of the Iraqi health system. Thus, though many commentators contrasted the iraqbodycount and Johns Hopkins figures, they are not comparable. The bodycounters were simply recording, or at least attempting to record, deaths from combat violence, while the medical specialists were attempting something far more complete, an accounting of the full death toll wrought by the devastation of the US invasion and occupation.
Unlike the respectful applause granted the Congolese study, this one, published in the prestigious British medical journal The Lancet, generated a hail of abusive criticism. The general outrage may have been prompted by the unsettling possibility that Iraq’s liberators had already killed a third as many Iraqis as the reported 300,000 murdered by Saddam Hussein in his decades of tyranny. Some of the attacks were self-evidently absurd. British Prime Minister Tony Blair’s spokesman, for example, queried the survey because it “appeared to be based on an extrapolation technique rather than a detailed body count”, as if Blair had never made a political decision based on a poll. Others chose to compare apples with oranges by mixing up nationwide Saddam-era government statistics with individual cluster survey results in order to cast doubt on the latter. Some questioned whether the sample was distorted by unrepresentative hot spots such as Fallujah.
In fact, the amazingly dedicated and courageous Iraqi doctors who actually gathered the data visited 33 “clusters” selected on an entirely random basis across the length and breadth of Iraq. In each of these clusters the teams conducted interviews in 30 households, again selected by rigorously random means.
As it happened, Fallujah was one of the clusters thrown up by this process. Strictly speaking, the team should have included the data from that embattled city in their final result – random is random, after all – which would have given an overall post-invasion excess death figure of no less than 268,000. Nevertheless, erring on the side of caution, they eliminated Fallujah from their sample.
For such dedication to scholarly integrity, Roberts and his colleagues had to endure the flatulent ignorance of Michael E. O’Hanlon, sage of the Brookings Institution, who told the New York Times that the self-evidently deficient Iraqbodycount estimate was “certainly a more serious work than the Lancet report”.
No point in the study attracted more confident assaults by ersatz statisticians than its passing mention of a 95 per cent “confidence interval” for the overall death toll of between 194,000 and 8,000. This did not mean, as asserted by commentators who ought to have known better, that the true figure lay anywhere between those numbers and that the 98,000 number was produced merely by splitting the difference. In fact, the 98,000 figure represents the best estimate drawn from the data. The high and low numbers represented the spread, known to statisticians as “the confidence interval”, within which it is 95 per cent certain the true number will be found. Had the published study (which was intensively peer reviewed) cited the 80 per cent confidence interval also calculated by the team – a statistically respectable option – then the spread would have been between 152,000 and 44,000.
Seeking further elucidation on the mathematical tools available to reveal the hidden miseries of today’s Iraq, I turned to CounterPunch’s consultant statistician, Pierre Sprey. He reviewed not only the Iraq study as published in the Lancet, but also the raw data collected in the household survey and kindly forwarded to me by Dr. Roberts.
“I have the highest respect
for the rigor of the sampling method used and the meticulous
and courageous collection of the data. I’m certainly not criticizing
in any way Roberts’s data or the importance of the results.
But they could have saved themselves a lot of trouble had they
discarded the straitjacket of Gaussian distribution in favor
of a more practical statistical approach”, says Sprey.
“As with all such studies, the key question is that of ‘scatter’
i.e. the random spread in data between each cluster sampled.
So cluster A might have a ratio of twice as many deaths after
the invasion as before, while cluster B might experience only
two thirds as many. The academically conventional approach is
to assume that scatter follows the bell shaped curve, otherwise
known as ‘normal distribution,’ popularized by Carl Gauss in
the early 19th century. This is a formula dictating that the
most frequent occurrence of data will be close to the mean, or
center, and that frequency of occurrence will fall off smoothly
and symmetrically as data scatters further and further from the
mean – following the curve of a bell shaped mountain as you move
from the center of the data.
“Generations of statisticians
have had it beaten into their skulls that any data that scatters
does so according to the iron dictates of the bell shaped curve.
The truth is that in no case has a sizable body of naturally
occurring data ever been proven to follow the curve”. (A
$200,000 prize offered in the 1920s for anyone who could provide
rigorous evidence of a natural occurrence of the curve remains
unclaimed.)
“Slavish adherence to
this formula obscures information of great value. The true shape
of the data scatter almost invariably contains insights of great
physical or, in this case, medical importance. In particular
it very frequently grossly exaggerates the true scatter of the
data. Why? Simply because the mathematics of making the data
fit the bell curve inexorably leads one to placing huge emphasis
on isolated extreme ‘outliers’ of the data.
“For example if the average
cluster had ten deaths and most clusters had 8 to 12 deaths,
but some had 0 or 20, the Gaussian math would force you to weight
the importance of those rare points like 0 or 20 (i.e. ‘outliers’)
by the square of their distance from the center, or average.
So a point at 20 would have a weight of 100 (20 minus 10, squared)
while a point at 11 would have a weight of 1 (11 minus 10, squared).
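The arithmetic in Sprey's example can be checked with a short sketch. The cluster counts below are hypothetical, invented only to match his description (an average of ten, most clusters between 8 and 12, a few at 0 or 20); they are not figures from the Lancet survey, and the function name is ours.

```python
# Hypothetical cluster death counts, invented to mirror the example in the text:
# most clusters fall between 8 and 12, with one outlier at 0 and one at 20.
counts = [8, 9, 10, 10, 10, 11, 12, 0, 20]

def squared_deviations(values):
    """Return the mean and each value's squared distance from it."""
    center = sum(values) / len(values)
    return center, [(v - center) ** 2 for v in values]

center, weights = squared_deviations(counts)
weight_of = dict(zip(counts, weights))  # look up a weight by its count

# Share of the total squared deviation contributed by the two outliers alone.
outlier_share = (weight_of[0] + weight_of[20]) / sum(weights)

print(f"mean = {center}")
print(f"weight of 20 = {weight_of[20]}, weight of 11 = {weight_of[11]}")
print(f"outliers' share of total squared deviation = {outlier_share:.0%}")
```

In this made-up sample, two of the nine clusters account for nearly all of the squared deviation, which is the effect Sprey is describing.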
“This approach has inherently
pernicious effects. Suppose for example one is studying survival
rates of plant-destroying spider mites, and the sampled population
happens to be a mix of a strain of very hardy mites and another
strain that is quite vulnerable to pesticides. Fanatical Gaussians
will immediately clamp the bell shaped curve onto the overall
population of mites being studied, thereby wiping out any evidence
that this group is in fact a mixture of two strains.
“The commonsensical amateur
meanwhile would look at the scatter of the data and see very
quickly that instead of a single ‘peak’ in surviving
mites, which would be the result if the data were processed by
traditional Gaussian rules, there are instead two obvious peaks.
He would promptly discern that he has two different strains
mixed together on his plants, a conclusion of overwhelming importance
for pesticide application”.
(Sprey once conducted such a statistical study at Cornell – a bad day for mites.)
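A minimal sketch of the mite example follows, using invented survival percentages rather than anything from Sprey's Cornell study. It assumes a simple ten-point histogram and a naive definition of a "peak" (a bin fuller than both of its neighbours); both are our choices for illustration.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical survival percentages per plant (not data from any real study).
vulnerable = [10, 15, 20, 20, 20, 25, 25, 30]
hardy = [70, 75, 80, 80, 80, 85, 85, 90]
survival = vulnerable + hardy

def histogram(values, bin_width=10):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)

def peaks(hist, bin_width=10):
    """Return bins whose count exceeds both neighbours (missing bins count as 0)."""
    return sorted(
        b for b, n in hist.items()
        if n > hist.get(b - bin_width, 0) and n > hist.get(b + bin_width, 0)
    )

hist = histogram(survival)
print("single-bell summary: mean =", mean(survival), "sd =", round(stdev(survival), 1))
print("observations in the bin containing the mean:", hist.get(50, 0))
print("peaks found by looking at the scatter:", peaks(hist))
```

Summarised by a mean and standard deviation alone, this sample appears centred near 50 per cent survival, a value at which no plant was actually observed; the histogram shows the two separate peaks directly.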
So how to escape the Gaussian distortion?
“The answer lies in quite
simple statistical techniques called ‘distribution free’ or ‘non
parametric’ methods. These make the obviously more reasonable
assumption that one hasn’t the foggiest notion of what the distribution
of the data should be, especially when considering data one hasn’t
seen before. One is instead prepared to let the data define its own
distribution, whatever that unusual shape may be, rather than
forcing it into the bell curve. The relatively simple computational
methods used in this approach basically treat each point as if
it has the same weight as any other, with the happy result that
outliers don’t greatly exaggerate the scatter.
“So, applying that simple
notion to the death rates before and after the US invasion of
Iraq, we find not only that the confidence intervals around the
estimated 100,000 ‘excess deaths’ shrink considerably but also
that the numbers move significantly higher. With a
distribution-free approach, a 95 per cent confidence interval
thereby becomes 53,000 to 279,000. (Recall that the Gaussian
approach gave a 95 per cent confidence interval of 8,000 to
194,000.) With an 80 per cent confidence interval, the lower
bound is 78,000 and the upper bound is 229,000. This shift
to higher excess deaths occurs because the real, as opposed to
the Gaussian, distribution of the data is heavily skewed to the
high side of the distribution center”.
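The article does not say which distribution-free technique Sprey applied, so the following is only a sketch of one standard example of the family: an order-statistic confidence interval for the median, set beside the conventional Gaussian interval for the mean. The mortality ratios are invented, the function names are ours, and the two intervals estimate different quantities (median versus mean), so this illustrates the idea rather than reproducing his figures.

```python
from math import comb, sqrt
from statistics import NormalDist, mean, stdev

def gaussian_interval(data, level=0.95):
    """Conventional interval for the mean: mean +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * stdev(data) / sqrt(len(data))
    return mean(data) - half, mean(data) + half

def median_interval(data, level=0.95):
    """Order-statistic interval for the median; coverage is at least `level`.

    Every point counts once, by rank, so one extreme value cannot widen it.
    """
    xs = sorted(data)
    n = len(xs)
    alpha = 1 - level
    cumulative, k = 0.0, 0
    # Largest k such that P(Binomial(n, 1/2) <= k - 1) <= alpha / 2.
    for j in range(n):
        cumulative += comb(n, j) / 2 ** n
        if cumulative > alpha / 2:
            break
        k = j + 1
    if k == 0:
        raise ValueError("sample too small for this confidence level")
    return xs[k - 1], xs[n - k]

# Hypothetical post-/pre-invasion mortality ratios for 15 clusters, one extreme.
ratios = [0.7, 0.9, 1.1, 1.2, 1.3, 1.4, 1.5, 1.5, 1.6, 1.8, 1.9, 2.1, 2.4, 3.0, 12.0]

print("Gaussian 95% interval for the mean:", gaussian_interval(ratios))
print("Distribution-free 95% interval for the median:", median_interval(ratios))
```

With this invented sample the single extreme cluster stretches the Gaussian interval until its lower end drops below 1.0 (no increase in mortality), while the rank-based interval stays narrower and entirely above 1.0.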
Sprey’s results make it clear that the most cautious estimate possible for the Iraqi excess deaths caused by the US invasion is far higher than the 8,000 figure imposed on the Johns Hopkins team by the fascist bell curve. (The eugenicists of the 1920s were much enamored of Gaussian methodology.) The upper bounds indicate a reasonable possibility of much higher excess deaths than the 194,000 excess deaths (95 per cent confidence) offered in the study published in the Lancet.
Of course the survey on which all these figures are based was conducted fifteen months ago. Assuming the rate of death has proceeded at the same pace since the study was carried out, Sprey calculates that deaths inflicted to date as a direct result of the Anglo-American invasion and occupation of Iraq could be, at best estimate, 183,000, with an upper 95 per cent confidence boundary of 511,000.
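The article does not spell out Sprey's extrapolation, but a straight-line scaling comes out close to his figures. The sketch below assumes the survey covered 18 months (the text says "just under eighteen"), that 15 further months have passed, and that deaths continued at a constant monthly rate; the constant and function names are ours.

```python
SURVEY_MONTHS = 18   # assumption: "just under eighteen months", rounded up
ELAPSED_MONTHS = 15  # "conducted fifteen months ago"

def extrapolate(deaths_at_survey):
    """Scale a death toll forward, assuming a constant monthly rate."""
    total_months = SURVEY_MONTHS + ELAPSED_MONTHS
    return deaths_at_survey * total_months / SURVEY_MONTHS

best_estimate = extrapolate(100_000)  # from the rounded "about 100,000" figure
upper_bound = extrapolate(279_000)    # from the distribution-free 95% upper bound

print(f"best estimate to date: {best_estimate:,.0f}")
print(f"upper 95% bound to date: {upper_bound:,.0f}")
```

Under these assumptions the results land within a few hundred of the 183,000 and 511,000 quoted above; the small differences presumably reflect the exact survey period or rounding Sprey used.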
Given the generally smug and heartless reaction accorded the initial Lancet study, no such updated figure is likely to resonate in public discourse, especially when it registers a dramatic increase. Though the figures quoted by Bush were without a shadow of a doubt a gross underestimate (he couldn’t even be bothered to get the number of dead American troops right), 30,000 dead among the people we were allegedly coming to save is still an appalling notion. The possibility that we have actually helped kill as many as half a million people suggests a war crime of truly twentieth-century proportions.
In some countries, denying the fact of mass murder is considered a felony offence, incurring harsh penalties. But then, it all depends on who is being murdered, and by whom.
Andrew Cockburn is the co-author, with Patrick Cockburn, of Out of the Ashes: the Resurrection of Saddam Hussein.