With a month of new Covid-19 data and employment data, the evidence I made reference to in my previous post can be updated. I’ll make a quick note here about it.
In August, the US saw more than 30,000 Covid-19 deaths. Canada saw 206. This amounts to a near 17 times difference in the rate of death–meaning that the average American was 17 times more likely to die from Covid-19 than the average Canadian in the month of August. I am not aware of any other cause of death, disease or injury that shows an equivalently striking difference between these two countries 1. For reference, the homicide rate in the US is only 2.8 times that of Canada.
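For anyone who wants to check the arithmetic, here is a rough sketch of the per-capita comparison. The death counts are the August figures quoted above; the population figures (about 330 million for the US and 37.6 million for Canada) are my own round-number assumptions, so treat the result as approximate.

```python
# Death counts are the August figures quoted in the text.
# Population values are rough assumptions, not figures from the post.
US_DEATHS = 30_000              # "more than 30,000", so a lower bound
CANADA_DEATHS = 206
US_POPULATION = 330_000_000     # assumed
CANADA_POPULATION = 37_600_000  # assumed


def deaths_per_100k(deaths, population):
    """Crude death rate per 100,000 people."""
    return deaths / population * 100_000


us_rate = deaths_per_100k(US_DEATHS, US_POPULATION)
canada_rate = deaths_per_100k(CANADA_DEATHS, CANADA_POPULATION)
rate_ratio = us_rate / canada_rate

print(f"US:     {us_rate:.2f} deaths per 100,000")
print(f"Canada: {canada_rate:.2f} deaths per 100,000")
print(f"Ratio:  {rate_ratio:.1f}")
```

With these inputs the ratio lands a little under 17, which is where the "near 17 times" figure comes from.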
The trend in employment growth remains about the same as in previous months. Canada saw a larger percent improvement in employment growth in August at 1.28% when compared to 1.08% in the US, but Canada is still slightly behind the overall return to normal (Canada is 5.57% below January employment and the US is 5.24% below January).
So far the message is clear: the behaviour of Canadians has saved lives that would not have been saved if Canadians adopted the US response to Covid-19. Let’s keep at it. Whatever we are doing seems to be working fairly well in comparison to the most available alternative approach.
I did a quick and dirty estimate of curling related injuries based on some research and Internet-based data. Google says there are around 436,000 regular curlers in Canada. In the US, Google says there are 25,000 registered curlers. The injury rate (per curling exposure) has been estimated at around 2 per 1000. Putting these numbers together suggests that the average Canadian has a 17 times greater risk of curling injury than the average American.
Here we are in August 2020 and the pandemic seems to be firmly attached to our short and medium term future. Many of us are holding out hope that vaccines will bring everything back to normal in short order. Yet questions remain. Will we see effective vaccines soon? Will they slow the spread of infections? Will they be safe? Will they be accessible?
Other questions–about the safety of returning to school, the risk of a second wave, and the long term impact on the economy–add yet further uncertainty about our future.
The uncertainty may leave you feeling a bit depressed. But there are reasons to feel optimistic. While the future is uncertain, in Canada (and many other countries) there is growing evidence that this infection is manageable (masks, physical distancing and hygiene all seem to work), and returning to quasi-normal life could happen before an effective vaccine is widely available.
The purpose of this short essay is to make an argument that things could be much worse and that Canadians should take some satisfaction in how we’ve handled the Covid-19 situation so far. I make this argument simply by drawing a comparison between Canada and our closest counterfactual: The United States of America.
Part 1. The epidemiological argument
Between March 1 and July 31, 2020, more than 150,000 Americans died from SARS-CoV-2 infection. During the same period, roughly 9,000 Canadians died from SARS-CoV-2. In per-capita terms, the US saw about twice Canada’s rate of death. This is widely known, and on the face of it suggests that Canada did something right that the US did not do. However, this difference is far more striking when viewed on a monthly basis. Take a look at the table below:
Looking at deaths per 100,000 for April, May and June, the ratio of rates hovers around 2 to 1 (the US has roughly twice the death rate of Canada). Who knows what accounts for this–is it access to health care? Population vulnerability? Environmental differences? It’s hard to say. What is clear is that in July, the ratio of the Covid-19 death rates changes dramatically: in the US, the rate is 7 times higher than in Canada. Such a short term change is almost certainly not due to some structural difference between the countries (like health care access or underlying health status), but due to short-term policy decisions and behaviours in the US that occurred in June and July. What specific decisions and behaviours, who can say. What is clear is that a country to country comparison strongly suggests that something right happened in Canada and/or something wrong happened in the US.
To make the difference tangible, I’ve also calculated attributable risk, shown in the last two columns of the table above. The attributable deaths in the US are the deaths that occurred as a result of the excess mortality rate in the US. In simple terms, these are deaths that would not have occurred if the US had the same risk profile as Canada. The last column shows the additional deaths that would have happened in Canada if Canada had the US mortality rate. We can think of these as lives saved due to our decisions and behaviour that were different from those in the US.
Again, the July data are most striking. If Canadians had the same risk profile as the US, an additional 2336 people would have died in July alone. In the US, more than 20,000 people died in July who would not have died if they had Canada’s risk profile.
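The attributable-death calculation is simple enough to sketch in a few lines. Since the monthly table is not reproduced here, this sketch uses the March to July totals quoted earlier (150,000 US deaths and 9,000 Canadian deaths), and the population figures are my own rough assumptions. It will not reproduce the July-only numbers, but it shows the method.

```python
# Totals for March 1 to July 31 as quoted in the text (both approximate).
US_DEATHS = 150_000
CANADA_DEATHS = 9_000
# Population values are rough assumptions, not figures from the post.
US_POPULATION = 330_000_000
CANADA_POPULATION = 37_600_000


def expected_deaths(population, reference_rate):
    """Deaths a country would have seen at another country's death rate."""
    return population * reference_rate


us_rate = US_DEATHS / US_POPULATION
canada_rate = CANADA_DEATHS / CANADA_POPULATION

# US deaths that would not have occurred at Canada's rate.
us_excess = US_DEATHS - expected_deaths(US_POPULATION, canada_rate)
# Additional Canadian deaths that would have occurred at the US rate.
canada_averted = expected_deaths(CANADA_POPULATION, us_rate) - CANADA_DEATHS

print(f"US excess deaths:       {us_excess:,.0f}")
print(f"Canadian deaths averted: {canada_averted:,.0f}")
```

With these assumed populations the US excess comes out at roughly 70,000 for the five-month period, and the Canadian figure at roughly 8,000.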
Part 2. The economic argument
Some have suggested that the US made a calculated trade off. The argument is that unemployment causes death too. By shutting down economies people lose jobs, which will cause more death as a result of unemployment and economic desperation. By returning to normal faster, there was more Covid-19 death in the US, but less unemployment-related-death.
What do the unemployment numbers say? Take a look for yourself:
In total, the US has slightly fewer net job losses than Canada for 2020–by about half a percent. However, the data also show that the timing of job losses differs between these countries. In Canada, there was a rapid drop in employment and a slightly lagged recovery, but more job growth in June and July. Once the August numbers come out, Canada may yet end up better off overall. In any case, at the moment, the difference in Covid-19 related employment impacts between these two countries seems pretty small.
It’s hard to say definitively if the small difference in employment recovery makes up for the excess loss of life due to Covid-19. In the US, half a percent of the workforce corresponds to about 700,000 workers. This means that if the US had Canada’s employment profile, 700,000 fewer Americans would have a job at the end of July. Are 700,000 jobs a worthwhile trade-off in exchange for the lives of the 72,000 excess deaths due to Covid-19? It’s hard to judge. However, the relationship between unemployment and mortality risk does depend on time; short term unemployment is more weakly associated with increased mortality than long term unemployment. Many of the job losses thus far have been short lived. Moreover, the wealth transfers (the CERB in Canada and whatever they have in the US) would probably offset some of the trauma of losing a job. So it seems unlikely that unemployment due to Covid-19 will have led to a spike in mortality similar to equivalent unemployment growth in ‘normal’ times.
The US and Canada are suitable for comparison because of the timing of infection, and the similarities in culture and population vulnerability. Comparisons between these two countries are more meaningful than comparisons between either of these countries and any other country in the world. In this comparison, Canada comes out clearly ahead.
Many countries have lower mortality than the US and Canada, though Canada’s mortality rate is (as of July) low by any standard of comparison. Lots of things went wrong in Canada, and maybe thousands of deaths could have been prevented. However, in comparison to the most natural alternative (the US) Canadians should be confident that some of the actions we took clearly saved lives, and that if we continue on this course, we can prevent many more unnecessary deaths.
Consider the following. After years of study, researchers estimate with a high degree of certainty that there is a 60% chance of a particular event (call it A) happening. When asked to make a discrete prediction of whether or not A will actually happen at a moment in time, 100 out of 100 experts independently conclude that the event will happen.
Now consider this. After years of study, researchers estimate with a high degree of certainty that there is a 50% chance of a particular event (call it B) happening. When asked to make a discrete prediction of whether or not B will actually happen at a moment in time, 50 out of 100 experts independently conclude that the event will happen.
The expert predictions in both of these scenarios are perfectly rational. These independent expert predictions provide the most accurate long-run information about whether or not A and B will happen. However, in the second scenario the aggregate prediction (e.g., by taking the average) is precisely correct, and in the first scenario the aggregate prediction is infinitely wrong.
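Here is a small sketch of the two scenarios. It rests on an assumption of mine about how experts behave: each expert knows the true probability and predicts whichever outcome is more likely, flipping a coin when the odds are exactly even. The function name and the seed are just for illustration.

```python
import random


def expert_vote_share(true_probability, n_experts=100, seed=0):
    """Share of experts predicting the event will happen.

    Assumed behaviour: every expert predicts the more likely outcome,
    and guesses at random when the probability is exactly 0.5.
    """
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_experts):
        if true_probability > 0.5:
            votes += 1                      # the event is the safer call
        elif true_probability == 0.5:
            votes += int(rng.random() < 0.5)  # coin flip on a toss-up
        # below 0.5 the expert predicts the event will not happen
    return votes / n_experts


share_a = expert_vote_share(0.6)  # scenario A: 60% true probability
share_b = expert_vote_share(0.5)  # scenario B: 50% true probability

print(f"A: true probability 0.60, share predicting yes {share_a:.2f}")
print(f"B: true probability 0.50, share predicting yes {share_b:.2f}")
```

Under this assumption the share for A is always 1.0, however far the true probability is from certainty, while the share for B hovers around one half.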
If you want to see a real world example, take a look at the predictions of 18 experts on the NHL post season for 2020:
All 18 experts predict that Pittsburgh is going to win their playoff series. For each expert this prediction makes sense–by most measures, Pittsburgh is the better team. However, this information does not give me a realistic representation of the actual probability that Pittsburgh will win. As bad as Montreal is, they have a better than 0% chance of winning the series.
In contrast, if we sum the total number of experts predicting New York will win and divide it by the total number of predictions, New York is given a 56% chance of winning their series. This number is probably a pretty good long-run estimate of the probability that New York will win the series. There is no consensus, and that actually yields a more realistic aggregate prediction!
What this quasi-paradox suggests is that the closer experts are to a consensus about an event, the more likely we are to get a bad aggregate prediction of the true probability of an event. If we combine the expert predictions, we will think that the event is more (or less) probable than it actually is.
This is a reminder of why when consulting an expert, we should not ask if something will happen, but instead ask about the probability that something will happen. Among other things, this probability is something we can average across experts to get a sort of ‘meta’ prediction.
It is also a reminder not to mistake an expert consensus about an event as equivalent to a guarantee that the event will happen.
In previous posts, I have advocated surveillance of the novel coronavirus through testing of random samples of the population (or some quasi-random sampling method that yields equivalent estimates of infection). I still think this is a good idea, though there remain practical concerns.
Some people are not only clamouring for more tests, but are demanding widespread use of serological testing as well. These are blood tests that detect indirect evidence of infection related to our immune system. These tests are important for determining what proportion of the population has been exposed to the novel coronavirus, and could be informative for understanding the impact of the disease, and possibly the level of immunity in the population.
More testing is useful, but it does require a level of caution in interpretation, particularly since all tests are imperfect–both failing to detect true cases, and falsely detecting non-cases. Serological tests for the novel coronavirus could have a higher error rate than the polymerase chain reaction (PCR) tests. PCR tests work by detecting actual genetic material from the virus, and serological tests detect antibodies that are signs of past infection. However, for both tests, you might be surprised at how little information the test may actually provide you as an individual, particularly in such uncertain times. The purpose of this post is to show why tests (of all sorts) should be interpreted with considerable care, and how increased testing may not always yield useful information to the person being tested.
Now for a little probability theory
We begin with an assumption about the current incidence rate (strictly speaking this is a prevalence, the share of the population infected at a moment in time, but I’ll keep calling it the incidence rate). The incidence rate tells us the average probability of infection at a moment in time–what the risk is to the average individual. Let’s assume that it’s 0.01. This means that 1% of the population is infected, and a randomly selected person has a 1% chance of being infected absent any other information. I’ll refer to this as the baseline probability of infection, or P(infected).
Next, we will assume that the probability of a tested person testing positive is around 2%. This is in the ballpark for a number of jurisdictions at present. It is higher in some areas (in New York, the number is closer to 25%), and lower in some places (in Alberta, it’s less than 2%). This is P(testing positive).
Finally, we’ll assume that the PCR test has a sensitivity of 0.95. This means that it will correctly identify a true positive case 95% of the time. We’ll call this P(test positive|true infected), which is the probability that a person will test positive given that they are truly infected.
With this information, we can understand the meaning of a positive test issued to a person selected randomly from the population, and this information can contribute to our understanding of the level of infection in a population.
However, for the tested person, and the population in general, this information is tricky to interpret.
The first thing to realize is that in spite of the high sensitivity above, a positive test is not 95% accurate. In other words, someone who tests positive is not 95% likely to have the novel coronavirus.
We’re going to use Bayes’ theorem to figure out what the probability of infection is given a test with a 95% sensitivity. Thomas Bayes made one of the most profound discoveries in probability theory (if not science!) about how information can update our understanding about the state of the world. Most simply, the theorem tells how we can update a prior probability with information to get a posterior probability. The posterior probability is the probability we estimate given the information available to us. In this case, the posterior probability is the probability that a person has SARS-CoV-2 given they received a positive test result. The information is the test result. The prior probability is the current incidence rate.
We use the following formula:
P(true infected|test positive) = P(infected) * P(test positive|true infected) / P(testing positive)
to calculate the posterior probability. If we substitute in the values above, we get 0.01 * 0.95 / 0.02 = 0.475. Based on these values, if you get a positive test for SARS-CoV-2, there is less than a 48% chance that you actually are infected!
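If you would like to play with the numbers yourself, the calculation is a one-liner: the prior times the sensitivity, divided by the overall probability of testing positive. This is only a sketch; the function and argument names are mine, and the inputs are the assumed values from above.

```python
def posterior_probability(prior, sensitivity, p_positive):
    """Bayes' theorem: P(infected | positive test).

    prior       -- P(infected), the baseline probability of infection
    sensitivity -- P(test positive | true infected)
    p_positive  -- P(testing positive), among everyone tested
    """
    return prior * sensitivity / p_positive


posterior = posterior_probability(prior=0.01, sensitivity=0.95, p_positive=0.02)
print(f"P(infected | positive test) = {posterior:.3f}")
```

Changing any one of the three inputs shows how sensitive the answer is to the assumptions.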
Interpreting the meaning of a posterior probability is a little tricky, and gets into questions about the very meaning of probability. The posterior is an estimate of subjective probability–the probability assessment we make given what we know. Since what we know varies, this subjective probability can also vary. If you have symptoms typical of covid-19 when you get tested, then we might make adjustments to the formula; for example the baseline incidence of infection with covid-19 is higher among the subset of the population that have a fever and a dry cough. This could increase P(infected) and P(testing positive), and result in a posterior probability much closer to 1.0.
If this all seems overly technical, think of the problem in the following way. If you lived on a desert island and never came into contact with anyone and then took a test for SARS-CoV-2, would you trust the result? Probably not. Why? Well, because the prior knowledge of your circumstances says that your underlying risk is very, very low–almost 0. You also know that the information from the test is not perfect, so your intuition tells you that it is more likely that the test information is wrong than that you actually have a SARS-CoV-2 infection. This is an extreme example, but illustrates the same general idea: when the uncertainty from the information we receive is high compared to the probability of the outcome that the information predicts, the information is not very useful.
This is important to understand when we consider the prospect of more widespread testing, particularly when using a test with a higher false positive rate. Let’s say that the serological antibody tests have a sensitivity of 0.95, but a higher false positive rate. A higher false positive rate will increase the denominator of the formula above, all else being equal. So it could be that rather than 0.02 (2% of people taking the test are found to be positive) it would be 0.03 (note: I don’t know if this is the right number, so take this example with a grain of salt). This increase is due to the increased number of false positives from serology-based antibody tests. This would yield a posterior probability of around 0.32–meaning that a person with a positive test would have a 32% chance of actually having a novel coronavirus infection.
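To see how the false positive rate feeds into the denominator, here is a sketch that builds P(testing positive) from its two parts: true positives plus false positives. This is a slightly different route from the one above, where I simply assumed the overall positive rate. The false positive rates used here (about 1.06% and 2.07%) are values I picked so that the denominator comes out near 0.02 and 0.03; they are not measured properties of any real test. The symptomatic prior of 0.3 is likewise hypothetical.

```python
def posterior_from_error_rates(prior, sensitivity, false_positive_rate):
    """P(infected | positive test), with the denominator built from
    the law of total probability rather than assumed directly."""
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    p_positive = true_positives + false_positives
    return true_positives / p_positive


# All false positive rates below are illustrative assumptions.
pcr_like = posterior_from_error_rates(0.01, 0.95, 0.0106)
serology_like = posterior_from_error_rates(0.01, 0.95, 0.0207)
# Hypothetical higher prior for someone with typical symptoms.
symptomatic = posterior_from_error_rates(0.30, 0.95, 0.0106)

print(f"Lower false positive rate:  {pcr_like:.2f}")
print(f"Higher false positive rate: {serology_like:.2f}")
print(f"Higher prior (symptoms):    {symptomatic:.2f}")
```

The first two results come out close to the 0.475 and 0.32 figures above, and the third shows how a higher prior pushes the posterior much closer to 1.0.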
Again, from a public health perspective, this issue is not necessarily a problem if experts are careful in their interpretation of data, especially if the purpose of the tests is to estimate past infection (which is typical for serological tests for antibodies). But for tests used to estimate current infections, it means that many people who aren’t infected by SARS-CoV-2 could be told they are, and could endure all the burdens that come with this diagnosis (quarantine, fear, etc.) unnecessarily. It could even lead to reckless behaviour at some point; people who think they’ve had an infection that they have not had may then go into the community with a false sense of security, and get the infection due to the false belief of immunity.
This should also be a reminder about the dangers of individual people demanding medical tests, particularly if the tests have large risks of false positives. Tests have to be interpreted in the context of the person receiving them; if an asymptomatic person is tested, they should probably be warned that a positive test may not be an indication of infection.
Information from tests is much more complicated than most of us realize, and can be downright misleading in some cases. While some folks may think that the gate-keeping of tests (by physicians or governments) is just another example of big brother trying to save money at the expense of our health, it may sometimes be in our best interests to remain untested, particularly if the tests have high rates of error.
Google recently released the results of an analysis of mobility data. These results are very interesting, especially when cross-regional comparisons are made. In brief, these results tell us how much society is using public spaces–like grocery stores, retail outlets and parks–and are an indirect indicator of physical distancing behaviour. Below are a few of my early observations.
Where do we go now?
Regions that have been identified as locations of high levels of infection and mortality appear to have the greatest reduction in visits to retail outlets, workplaces and other public destinations in our communities. Here is a figure for Italy:
What we see is a 94% reduction in Italian visits to retail and recreation locations as of March 29, 2020 compared to a baseline of activity from a few weeks prior. The decline started in early March, when news of Italy’s health care crisis was first emerging.
In contrast, here is a figure for Sweden:
Note that the drop is much less dramatic, and the decline starts around March 8th. It’s worth noting that Sweden has a different take on the covid-19 crisis. There is little mandated physical distancing. Most public health policy is focussed on vulnerable populations and prohibition of large gatherings.
In Canada, it looks like this:
Canada saw a later start to the decline than Italy and Sweden, but has seen a dramatic reduction nonetheless. The US data as a whole look very much like Canada’s, though the magnitude of reduction is a bit lower (a 47% decline from baseline as of March 29th). However, this varies from state to state. In New York, for example:
In Arkansas, on the other hand, the reduction has been less dramatic:
Also interesting is the change in activity by location type. National parks, public beaches, dog parks and public gardens have seen less of a decline in activity over time. In Canada, we see a smaller drop in the use of these park spaces:
Perhaps this makes sense–people need to get out for exercise, and view these spaces as safe (given their spaciousness, and our feelings towards the healthiness of nature generally). Moreover, in much of Canada, we are coming out of winter, and many people are clamouring for sun.
In Australia, where climate is warmer generally, and seasons are reversed from those in the northern hemisphere, we see a larger decline in use of parks over this period:
What does this all look like in countries where novel coronavirus has been around for a while? Well, here are some figures for South Korea:
We can see here that South Korea did not seem to rely as much on physical distancing for infection control, at least not to the same extent as many countries are doing now. Note that visits to parks and grocery/pharmacy are actually up. I should mention that these comparisons are a little tricky, as we don’t know what happened in early February or January, and I am unsure what Google uses as a baseline for South Korea; if it’s the same for everywhere around the world, it’s hard to know how to interpret the data for countries that experienced the outbreak earlier.
There are no data for mainland China, but there are data for Hong Kong and Taiwan. Here are some figures for Hong Kong:
Based on these data, residents of Hong Kong seem to be pretty consistent in their mobility behaviour for at least the last 6 weeks or so.
What does this all mean?
At this point, it’s very hard to say much about these data beyond what we see from these figures. I wish Google would put this in a table with day-specific records; as it stands, all we can do is look at the graphs, and can’t really analyze many numbers yet. Still, I think these data could be useful in the next few weeks, when we can compare the trends in mobility to trends in cases, testing and deaths (notwithstanding the very low quality of infection data right now). At some point, we could see some connection between mobility behaviour and testing.
However, one observation that could be important here is that the most extreme physical distancing behaviour is a response to crises already in force. New York state and Italy and other areas with a large number of cases and/or deaths take this practice most seriously, but only did this once the infection was well under way. In other areas, physical distancing has been employed less, but (at least in the case of South Korea…maybe), the infection still flattened with less radical levels of physical distancing behaviour. Maybe something else that Koreans were doing (like wearing masks…?) is what really accounted for the decline in infection. This is all still pretty speculative now, but the data (so far) do not seem inconsistent with that hypothesis.