This headline in today's Times of London certainly piqued my interest: "Societies worse off 'when they have God on their side'". The story, by Ruth Gledhill, summarizes a study in the Journal of Religion and Society.
Religious belief can cause damage to a society, contributing towards high murder rates, abortion, sexual promiscuity and suicide, according to research published today. According to the study, belief in and worship of God are not only unnecessary for a healthy society but may actually contribute to social problems.
My interest was piqued for two reasons: I'm a Christian and I'm a statistician. So, I decided I needed to look into this. The first thing to do was find and read the original study, if at all possible. The news story included no link to the study or to the Journal of Religion and Society. However, after a few minutes with Google I had tracked them down. The HTML version of the article can be found here, and the PDF here.
In all honesty, I was not surprised to find that the article does not say what Ms Gledhill reports. (In my professional experience, it is not uncommon for untrained people to misinterpret or misunderstand statistics.) Ms Gledhill's news article uses the language of causation: "Religious belief can cause damage, contributing" toward all kinds of bad stuff. "Belief in God . . . may actually contribute to social problems".
However, the actual article, by a researcher named Gregory S. Paul, is careful to avoid attributing causation. Mr Paul repeatedly states that he is only investigating correlations, not causal relationships. Confusing the two is one of the most common errors in the interpretation of statistics: correlation is not at all the same thing as causation. I see this fallacy committed all the time. Even Statistics Canada makes this error occasionally. In fact, it is impossible to prove that a causal relationship exists between two (or more) variables solely through statistical analysis. (If only it were that easy.)
So, first conclusion: Ruth Gledhill's news report in the Times misrepresents the content of Mr Paul's study. (Note: I am not impugning Ms Gledhill's journalistic integrity. I assume that the misrepresentation was unintentional, but nevertheless that is what happened.)
However, Mr Paul is not off the hook. I found serious—indeed fatal—flaws in his analysis.
I pass over his lengthy and tendentious discussion of evolution and creationism. This, it seems to me, is basically tangential to the main point of his study, which is to examine correlations between religious belief and various social problems. Indeed, this only ends up confusing the issue because sometimes Mr Paul speaks as if belief in God and rejection of evolution are synonymous. But of course they are not. He himself acknowledges the existence of theistic evolutionists and evolutionary creationists, both of whom accept Darwinian evolution. The issue is further clouded by the fact that he never gives precise definitions of "evolution" or "creationism". Thus, at one point he implies that the Roman Catholic Church rejects the theory of evolution when, in fact, Pope John Paul II was the fourth pope in the 20th century to state that there is no conflict between evolution and the faith of the church. (There is no indication that the current pope thinks any differently.) The study would have been much clearer if Mr Paul had omitted all reference to evolution and creationism—but then his paper would have been at least one-third shorter.
Mr Paul proposes to correlate measures of religious faith with data on the occurrence of such social problems as homicide, suicide, sexually transmitted disease, and abortion. The plan of the study is to gather and compare data for countries he refers to variously as "prosperous developed democracies" and "developing democracies". The definition of these terms is never discussed; he just seems to assume we'll all know exactly which countries he's referring to. Eighteen countries are included for data comparison; among those omitted without clear explanation are Italy, Greece, Finland, Luxembourg, and Belgium. Why are these left out? He mentions in passing that "[t]he especially low rates [of homicide] in the more Catholic European states are statistical noise due to yearly fluctuations incidental to this sample", but no statistical evidence corroborating this assertion is provided. India would seem to fit in with "developing democracies". Why was it excluded? Not "prosperous" enough? Don't know: Mr Paul doesn't say. Why were Russia, Poland, the Czech Republic, and the rest of the new eastern European democracies excluded? Don't know: again, Mr Paul doesn't say.
So, Mr Paul's sample frame appears arbitrary. Obviously, in a sample of eighteen observations, inclusion or exclusion of only one or two observations can make a big difference in the results.
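To illustrate how much a single observation can matter in a sample of eighteen, here is a minimal Python sketch. The numbers are invented purely for illustration and are not Mr Paul's data; the function names `pearson_r` and `leave_one_out` are my own.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def leave_one_out(xs, ys):
    """Correlation recomputed after dropping each observation in turn."""
    return [
        pearson_r(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]

# INVENTED data: 17 points with no real trend, plus one extreme observation.
x = [10, 12, 15, 18, 20, 22, 25, 27, 30, 32, 35, 38, 40, 42, 45, 48, 50, 90]
y = [5, 3, 6, 4, 5, 3, 6, 4, 5, 3, 6, 4, 5, 3, 6, 4, 5, 20]

r_full = pearson_r(x, y)
r_dropped = leave_one_out(x, y)

print("all 18 observations:", round(r_full, 3))
print("without the extreme one:", round(r_dropped[-1], 3))
```

With these made-up numbers, the correlation is strongly positive when all eighteen points are included and close to zero once the single extreme point is dropped. That is why the choice of which countries to include needs to be justified.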
Another problem with Mr Paul's sample frame is that the time frame of the observations is ambiguous. Referring to data on the social problems under examination, he says: "Data is [sic] from the 1990s, most from the middle and latter half of the decade, or the early 2000s." So, it sounds like the sample of data to be compared uses different reference years for different countries. Nowhere does he list which year pertains to each data observation. At best, this is very sloppy statistical practice. If one were suspicious, one might point out that this makes cooking the results child's play.
Religious faith and acceptance of evolution were the only variables compared with the incidence of social problems. This decision was justified thus: "The cultural and economic similarity of the developing democracies minimizes the variability of factors outside those being examined." This claim is highly debatable. There are many socio-economic data series that vary widely across the eighteen countries and that plausibly have a significant impact on social conditions, e.g., income distribution, proportion of GDP spent through government, social and cultural cohesion, fertility and mortality rates, and the age structure of the population. Failure to look at these and other exogenous data would introduce bias into the results, further calling them into question.
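The bias from an omitted variable can be shown with a small simulation. This is only a sketch under stated assumptions: the data are randomly generated, the variable `z` stands in for some hypothetical unexamined factor, and `x` and `y` have no direct connection to each other at all. The helper `partial_r` uses the standard first-order partial correlation formula.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# SIMULATED data: z drives both x and y; x and y never influence each other.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))

print("raw correlation of x and y:", round(r_xy, 3))
print("after controlling for z:   ", round(r_xy_given_z, 3))
```

In this setup the raw correlation between `x` and `y` is substantial, yet it all but vanishes once `z` is taken into account. A study that never looks at `z` has no way of telling the difference.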
Mr Paul presents a series of charts plotting the incidence of religious belief paired with each of the social problem variables. The accepted statistical procedure at this point would be to present estimates of correlation coefficients and measures of goodness-of-fit. These estimates are necessary because they provide the precise quantification of relationships among the observed variables. How closely are they correlated? How much of the variation in the observed data is explained by the correlation? These are the basic nuts and bolts of statistical analyses of this kind. Calculation of these statistics can be done in a few different ways; one of the more common is by use of a technique known as linear regression analysis. So, I couldn't believe my eyes when I read this:
Regression analyses were not executed because of the high variability of degree of correlation, because potential causal factors for rates of societal function are complex, and because it is not the purpose of this initial study to definitively demonstrate a causal link between religion and social conditions. . . . Therefore correlations of raw data are used for this initial examination.
This is simply inexcusable in a research project involving statistical analysis. I have never seen anything like this—either in my professional career or in my university studies of statistics and econometrics.
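For readers unfamiliar with the estimates in question, here is a minimal sketch of a simple least-squares regression in plain Python, reporting the slope, the correlation coefficient, and the goodness-of-fit (R squared). The five data points are invented for illustration only, and the function name `simple_regression` is my own.

```python
import math

def simple_regression(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r, r_squared), where r is the Pearson
    correlation and r_squared is the share of variation in y explained
    by the fitted line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)

    # Goodness-of-fit from the residuals: 1 - (unexplained / total variation).
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - residual_ss / syy
    return slope, intercept, r, r_squared

# INVENTED data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r, r_squared = simple_regression(x, y)
print("slope:", slope)
print("intercept:", intercept)
print("correlation r:", r)
print("R squared:", r_squared)
```

Nothing in this calculation asserts that `x` causes `y`. It simply quantifies how closely the two move together and how much of the scatter a straight line accounts for, which is the information a chart alone cannot give.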
The three reasons listed for skipping the regression analysis are all bogus. "High variability of degree of correlation" is precisely why goodness-of-fit is estimated. If high variability appears to be an issue, that's all the more reason to run regressions. The second reason, "potential causal factors for rates of societal function are complex", is another reason why regression analysis needs to be conducted: to assess the impact of the unspecified correlated (not necessarily "causal") factors. Furthermore, this reason seems to contradict what Mr Paul said earlier: "The cultural and economic similarity of the developing democracies minimizes the variability of factors outside those being examined". The third reason, that the purpose of the study is not to establish causal relationships, is a red herring. Regression can be used to estimate supposed causal relationships, but it can also be used to calculate correlation coefficients without any implication of causality. In any case, if Mr Paul is anxious to avoid causal inferences, there are other equally valid techniques for calculating precise correlation estimates.
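One such technique, offered here as my own example rather than anything Mr Paul mentions, is the Spearman rank correlation, which measures how consistently two variables rise or fall together without fitting any line. A minimal sketch, with invented data and my own function names:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# INVENTED data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print("Spearman rho:", rho)
print("Pearson r:   ", r)
```

Either statistic yields a precise, reportable number for each pair of variables, and neither carries any claim about which variable, if any, is the cause.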
Worst of all, Mr Paul's "correlations of raw data" amount to eyeballing the pair-wise data plots. This hardly qualifies as a correlation analysis, or as any kind of statistical analysis. How unscientific can you get!
There are many more criticisms that could be raised but, as far as technical statistics goes, these are the most salient ones. In my professional judgment, the statistical and scientific validity of Mr Paul's study, and therewith of Ms Gledhill's news story, cannot be accepted.
UPDATE: Follow-up here.
UPDATE 2: Later development here.









