Interpreting weak effects in large studies: is dementia associated with proximity to roads?

An Ontario study investigating the risk of dementia associated with living near major highways has been getting press attention from around the world recently.  The results suggest that living near a major highway is associated with a 7% increase in the risk of dementia, a fairly small effect compared to many epidemiological studies.  As an academic with a healthy dose of natural skepticism (bordering on an unhealthy dose of cynicism), I was immediately doubtful of the authors’ research findings, so I read the abstract and skimmed the paper.  I saw some study design weaknesses that, when combined with the small effect size, suggest the results should be interpreted with great caution, and reported with considerable qualification.  I will briefly comment on one specific problem I see with the research, though there are several worth considering.

What’s the problem?

The study design is a retrospective cohort, and a very large one; this particular part of the study involved over 2 million persons.  Large study sizes have some important advantages; for example, they mean results are likely to be more generalizable to the population as a whole. They also have more power to detect weak (though ‘statistically significant’) effects.  As I’ve noted other times on my blog, large data are better at detecting all effects–true and false.  For this reason, big data research has to be as rigorous as possible–one small systematic error can be enough to greatly affect the interpretation of the data, particularly when effects are small.
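The size–significance point can be sketched with a small calculation.  This is an illustration only: the 5% baseline incidence is a made-up number, and I use expected proportions directly (no sampling noise) so that the only thing changing between the two scenarios is the sample size.

```python
import math

def z_two_proportions(p1, p2, n_per_group):
    """Two-sample z statistic for a difference in proportions,
    computed from the expected proportions themselves, so the
    result isolates the effect of sample size on 'significance'."""
    pooled = (p1 + p2) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    return (p2 - p1) / se

baseline = 0.05           # hypothetical baseline dementia incidence
exposed = 0.05 * 1.07     # a 7% relative increase, as reported

for n in (1_000, 1_000_000):
    z = z_two_proportions(baseline, exposed, n)
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"n per group = {n:>9,}: z = {z:5.2f} ({verdict} at alpha = 0.05)")
```

The identical 7% relative difference is nowhere near significance at a thousand subjects per group, but overwhelmingly ‘significant’ at a million–which is exactly why a small systematic error in a huge cohort can masquerade as a real effect.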

In this study, one important methodological shortcoming should cast some doubt on the observations the researchers make.  Specifically, the authors have not properly controlled for the confounding effect of income.

There is evidence that dementia has an association with income [1,2], and evidence that lower income is associated with living closer to major highways [3].  If the effect of income were not taken into account in this research, it could bias any association between dementia and living near a road–‘confounding’ our interpretation of the effect of interest.  In this particular case, the likely confounding is to produce a positive bias in the effect of interest, making the relationship between living near a busy highway and dementia seem stronger than it actually is.

To control for this, the authors did not use subject income; rather, they used neighbourhood income from the 2001 Canadian census.  Neighbourhood income is an imprecise measure of individual income, and therefore does not fully resolve the confounding problem.  What remains is residual confounding, the effect of which is (in this situation) a probable upward bias in the estimated association between dementia and living near a busy highway.  This is true even if the error in neighbourhood income is random.  How large the bias is remains unclear, but given the small size of the detected effect, it could easily undermine the main conclusion of the paper.  You can see a simple example of this effect in this Google Sheet I prepared.
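The residual-confounding mechanism can also be sketched as a minimal simulation.  Everything here is invented for illustration–the effect sizes, the strength of the income–highway relationship, and the amount of proxy error are arbitrary choices, not values from the study–but the qualitative result is the point: with no true highway effect at all, adjusting for a noisy proxy of income removes only part of the confounding bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Standardized individual income (simulated, not real data)
income = rng.normal(size=n)

# Hypothetical exposure: lower income makes living near a highway more likely
near_highway = ((rng.normal(size=n) - 0.5 * income) > 0).astype(float)

# Hypothetical outcome: dementia risk depends on income only --
# by construction there is NO true highway effect
dementia = -0.3 * income + rng.normal(size=n)

# Neighbourhood income = individual income plus random measurement error
neigh_income = income + rng.normal(size=n)

def ols_coef(y, *covars):
    """OLS coefficient on the first covariate, adjusting for the rest."""
    X = np.column_stack([np.ones(len(y))] + list(covars))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

coef_unadjusted = ols_coef(dementia, near_highway)
coef_true_income = ols_coef(dementia, near_highway, income)
coef_proxy = ols_coef(dementia, near_highway, neigh_income)

print(f"no adjustment:          {coef_unadjusted:+.3f}")   # spurious 'effect'
print(f"adjust for true income: {coef_true_income:+.3f}")  # near zero, as it should be
print(f"adjust for noisy proxy: {coef_proxy:+.3f}")        # bias only partly removed
```

Adjusting for true individual income recovers the (null) truth; adjusting for the noisy neighbourhood proxy leaves a positive residual association–a spurious ‘highway effect’ of exactly the kind that could inflate a small estimated risk.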

The media’s role

In spite of this, the research is probably still publication worthy.  The fundamental science is not unreasonable–which is to say, there is a plausible biological explanation for how exposure to air pollution could result in some effect on human systems–including the brain.  Furthermore, this study is building on other research in this area [4].  However, the modest observed effect combined with the methodological shortcomings (specifically residual confounding of income) require a high degree of qualification on the part of the authors, as well as the media writing about it.

Unfortunately, once research like this gets reported in the media (and pumped by the PR staff at the journal and the affiliated universities), qualifications are often lost–especially in newspaper headlines.  As of today (January 6, 2017), here are some headline examples:

[Screenshots of six news headlines]

Perhaps most readers will be thoughtfully suspicious about the results of this research, or follow up with a critical analysis of the original article, but I doubt it.  It is quite likely that many people will file these headlines into their memories as evidence of something substantive–perhaps that highways are causing dementia, or that academics have a nefarious agenda to attack motor-vehicle culture.  In either case, promoting this particular study as an important contribution to our understanding of the environmental risk factors for dementia is problematic since it lacks the rigour to justify influencing public assessments of risk or our understanding of the world.

My conclusion

What appears to be the study’s biggest strength (the size of the cohort) is actually part of the problem, since it is the study size that makes the result seem important. Ceteris paribus, a large study with small biases is more likely to produce small but ‘statistically significant’ false effects than a small study with small biases.  For this reason, I think it is often good practice to interpret effect size and study size together, and that one should be especially suspicious of large studies with small apparent effects.  Large studies with methodological flaws are becoming more common in this era of big data, which means that researchers, policy makers and the public need to be more vigilant than ever, and take great care in their interpretation of findings.