I teach a graduate class in data analysis, and we are going to critically assess a few papers in our last seminar of the year. I thought a good starting point would be my own work, since there are myriad flaws in most studies I’ve done. But then I stumbled on a recently published paper on the topic of perception and immigration. I decided that this paper would make a much more interesting example for discussion and illustration.
Noah Carl, a graduate student in the UK, published a paper titled “Net opposition to immigrants of different nationalities correlates strongly with their arrest rates in the UK”. I was struck by a combination of 1) how bad the paper is and 2) the potential impact of the results on public perception and policy. Nobody is perfect, so it’s reasonable to be at least somewhat forgiving of bad research, especially from early-career scholars. However, given the ease with which debates about immigration can be hijacked by misinformation, accusation and racism, I think academics should ensure that their contribution to the debate is beyond reproach (or at least close to it).
It’s worth noting that the paper was published in a bottom-tier online journal. Further, while my review is very critical, my intent is not to pillory the student, but rather to critique his work. Finally, I have no idea what is true. His conclusions could be right or wrong; I simply contend that this particular research offers no insight on the matter either way, and that research this bad should never be published in any form.
What does the researcher claim?
The paper looks at the association between British perceptions of immigrants by country of origin and the crime rates of these immigrant populations in the UK. The paper concludes that since the rate of crimes committed by immigrants correlates with the perception of immigrants, ‘public beliefs about immigrants are more accurate than often assumed.’
The conclusions are inconsistent with the evidence
I will go through the specifics of the paper as I critically assess it.
1. Ecological study design problem
The research design is ecological. This means that the units of analysis are not individuals, but aggregates (groups) of individuals. This is probably the weakest study design in the social sciences: it is not only observational, it does not actually measure anything about individual people, only about aggregations of people. One consequence is that these research designs tend to over-estimate model fit; any effects estimated tend to fit more poorly in the real world than they do in the model, because averaging within groups strips out individual-level variability and so under-states it. In this example, had the author used individual data on the perception of immigrants rather than averages to fit his models, he probably would have seen a much weaker relationship than the one he reports.
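To make this concrete, here is a small simulation using entirely made-up data (nothing from the paper): the individual-level correlation is modest, but the correlation computed from group-level averages of the very same data is close to perfect.

```python
# Purely illustrative simulation of the ecological inflation problem.
# Individual attitudes are only weakly related to a group-level rate,
# but averaging within groups washes out the individual noise, so the
# group-level correlation looks far stronger than the individual one.
import numpy as np

rng = np.random.default_rng(42)

n_groups, n_per_group = 23, 500
group_exposure = rng.normal(size=n_groups)  # e.g. a group-level rate

attitudes_by_group = []
for x in group_exposure:
    # individual attitudes: weak signal from the group rate, lots of noise
    attitudes_by_group.append(0.3 * x + rng.normal(scale=1.0, size=n_per_group))

# individual-level correlation
xs = np.repeat(group_exposure, n_per_group)
ys = np.concatenate(attitudes_by_group)
print("individual-level r:", round(np.corrcoef(xs, ys)[0, 1], 2))   # roughly 0.3

# ecological (group-mean) correlation on the same underlying data
group_means = np.array([a.mean() for a in attitudes_by_group])
print("group-level r:    ", round(np.corrcoef(group_exposure, group_means)[0, 1], 2))  # near 1
```

The underlying relationship is identical in both calculations; only the level of aggregation differs.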
2. Small sample problem
In addition to using a weak study design, the author relies on just 23 observations to draw his conclusions. Statistics can make up for small samples when study designs are strong and variables are measured without systematic error, but the small sample size used in this study is particularly troubling when combined with all the other problems with the study. Small samples are a multiplier of all other problems.
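To give a sense of how imprecise 23 observations are, here is a quick back-of-the-envelope calculation using the standard Fisher z approximation. The correlation of 0.75 is a placeholder chosen for illustration, not a figure taken from the paper.

```python
# Approximate 95% confidence interval for a correlation estimated from n = 23
# points, via the Fisher z-transformation. The r value is hypothetical.
import numpy as np

n, r = 23, 0.75          # hypothetical sample size and observed correlation
z = np.arctanh(r)        # Fisher z-transform of r
se = 1 / np.sqrt(n - 3)  # approximate standard error of z
lo, hi = np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)
print(f"95% CI for r: ({lo:.2f}, {hi:.2f})")  # roughly (0.49, 0.89): very wide
```

Even a seemingly large correlation is consistent with a broad range of underlying associations at this sample size.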
3. Bad sample
The researcher did not look at all immigrant data in the UK, but at a small non-random sample of 23 countries. There are large numbers of Italian and Portuguese immigrants in the UK, for example, but these data are not included in the study. If they were included, the results may have looked different. When the data we use are not exhaustive (complete) and not selected randomly, there is always the possibility that the selection of data will affect our findings in a systematic way. This is particularly problematic when the sample is small; a small non-random sample is the perfect storm of statistical badness.
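Here is a toy illustration of the problem, again with synthetic data rather than the paper’s: suppose there were 60 origin countries with a modest underlying association. The correlation you compute then depends a great deal on which 23 countries you happen to include, and a non-randomly chosen subset could land anywhere in that range.

```python
# Synthetic example: how much the correlation moves around depending on
# which 23 of 60 hypothetical countries end up in the sample.
import numpy as np

rng = np.random.default_rng(0)
n_countries = 60
x = rng.normal(size=n_countries)
y = 0.4 * x + rng.normal(scale=1.0, size=n_countries)  # modest true relationship

subset_rs = []
for _ in range(2000):
    idx = rng.choice(n_countries, size=23, replace=False)
    subset_rs.append(np.corrcoef(x[idx], y[idx])[0, 1])
subset_rs = np.array(subset_rs)

print("full-sample r:", round(np.corrcoef(x, y)[0, 1], 2))
print("range of r across 23-country subsets:",
      round(subset_rs.min(), 2), "to", round(subset_rs.max(), 2))
```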
4. Missing variable problem
The author uses a multiple regression model to control for the ‘confounding’ effect of things like whiteness, English speaking, being from a Western country, and religion on his observation that crime rates influence the perception of immigrants. However, he did not control for other potential confounders, such as the economic wealth and productivity of the country of origin, historical tensions, or media portrayals. I added per capita GDP to his data set and observed that the log of per capita GDP correlates more strongly with the perception of immigrants than the log of crime rate does. It is hard to know which variables to include in an analysis like this, but it matters, because the inclusion and exclusion of variables can change how the data are interpreted.
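For readers who want to see what such a sensitivity check looks like, here is a sketch using synthetic stand-in data and hypothetical variable names, not Carl’s actual dataset. It illustrates the general point: when an omitted confounder is correlated with the predictor of interest, leaving it out can change the estimated coefficient dramatically.

```python
# Sketch of an omitted-variable check with synthetic data and hypothetical
# variable names. The numbers mean nothing; the point is the comparison of
# the coefficient on log arrest rate with and without log GDP per capita.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 23
log_gdp_pc = rng.normal(loc=9, scale=1, size=n)                      # synthetic log GDP per capita
log_arrest_rate = -0.5 * log_gdp_pc + rng.normal(scale=0.5, size=n)  # correlated with GDP
net_opposition = -5 * log_gdp_pc + 1 * log_arrest_rate + rng.normal(scale=2, size=n)

df = pd.DataFrame({
    "net_opposition": net_opposition,
    "log_arrest_rate": log_arrest_rate,
    "log_gdp_pc": log_gdp_pc,
})

m1 = smf.ols("net_opposition ~ log_arrest_rate", data=df).fit()
m2 = smf.ols("net_opposition ~ log_arrest_rate + log_gdp_pc", data=df).fit()

print("coefficient on log arrest rate, GDP omitted: ", round(m1.params["log_arrest_rate"], 2))
print("coefficient on log arrest rate, GDP included:", round(m2.params["log_arrest_rate"], 2))
```

In this constructed example the coefficient on the arrest-rate term is badly inflated when the GDP term is left out, which is exactly the kind of fragility that makes variable selection matter.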
5. Non sequitur
Carl draws the conclusion that ‘public beliefs about immigrants are more accurate than often assumed’, but the bulk of his analysis does not meaningfully address this claim. Carl has not defined what ‘accurate’ means, and no reasonable definition can be boiled down to crime rate alone, that is, to only the negative contributions of immigrants. If public perceptions were really ‘accurate’, they would also correlate with the positive contributions of immigrants; indeed, there would be a strong correlation between the net utility of immigrants and the perception of immigrants. But Carl focuses on only one possibly useful measure and ignores the rest. As such, even if the technical difficulties above were overlooked, his conclusion is a misdirection, since he is not really measuring the accuracy of public opinion.
My conclusion
There is more bad to say about this research, but I think I’ve made my point.
The author may respond by saying he did the best he could given the data available, but this is not an appropriate defense. There is a useful saying about putting lipstick on a pig: this research is an attempt to make good use of bad data, when it would have been better to collect better data in the first place rather than trying to dress up what was available to look pretty.
In short, it is never OK to publish research this bad, even in an inconsequential online journal. Let me repeat: I have no idea what is true here; the author may or may not be correct. What I do know is that this research leaves so much to be desired that it requires more qualification than the word limits of a journal article would typically allow.
What makes the research worse is its implications for public perception of immigrants and immigration. This research can (and dare I say will) be read and interpreted in support of a certain perspective on immigration. Research on fraught subject matter has to be held to an especially high standard of scientific rigour, since conversations on these kinds of topics can easily descend into name calling and hate. Scientific contributions to such debates are important, but can be undermined by shoddy work.