# Case fatality rate conundrum

One of the most important questions about the coronavirus outbreak today is the case fatality rate: what is the probability that an infected person will die? Ideally, we’d like to know the age-specific case fatality rate (the probability that an infected person of a given age will die). However, in spite of all the data routinely published online and freely accessible to all, the case fatality rate remains unclear. In fact, it remains very unclear, and many have discussed how early estimates of the case fatality rate were exaggerated [1,2,3]. What does this uncertainty mean? Does it mean that we are over-reacting to the threat that coronavirus poses? How should we make decisions given this uncertainty?

## The problem

I remember when the outbreak first started spreading outside of China, the media said the case fatality rate was 2-4%. This estimate was the number of deaths attributed to coronavirus divided by the total number of positive tests reported by China and other countries infected early. If true, this is a case fatality rate more than 10 times that of typical seasonal influenza. Given the large number of infections expected from a novel and highly transmissible virus, this suggested that millions of people would die from the pandemic.
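The early estimate described above is simple arithmetic, and a minimal sketch makes its definition explicit. The function name `naive_cfr` and the counts below are hypothetical, chosen only so the result lands inside the 2-4% range quoted by the media; they are not real data.

```python
def naive_cfr(deaths: int, confirmed_cases: int) -> float:
    """Naive case fatality rate: attributed deaths / positive tests."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return deaths / confirmed_cases


# Hypothetical counts for illustration only (not reported figures).
example_cfr = naive_cfr(deaths=30, confirmed_cases=1000)
print(f"Naive CFR: {example_cfr:.1%}")
```

Note that the denominator counts only people who tested positive, so anything that changes who gets tested changes the estimate even if the underlying risk of death stays the same.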

Since then, many countries have reported much lower case fatality rates. A month into the pandemic, Germany reported a case fatality rate of less than 0.5%. Canada has consistently reported numbers below 2%, with some provinces (like Alberta) reporting case fatality rates of less than 0.5%. Other countries have reported much higher case fatality rates; Italy has reported numbers greater than 10%. This range within Europe alone, from 0.5% to 10%, may be explained by differences in health care, underlying risks or demographics, or differences in testing criteria. I have discussed this last issue in a previous post.

I looked at some of the case fatality rates over time and plotted them:

Here is the R code I used to generate the data used in this figure. You can copy and paste it into RStudio or equivalent and it will generate a similar graphic for you.

This is a strange-looking figure, and while there is much that can be said about it, not much is definitive. Italy has seen a month of steadily rising case fatality rates. The higher case fatality rate in Italy may be due to how they define a coronavirus death. It could also be due to a lower testing rate; given the overwhelmed health care system, this seems a plausible explanation. The only testing data from Italy I could find are about a week old, and report some 200,000 tests in total, with around 50,000 positive. That is a very high proportion of positive tests (about 25%), which suggests that testing is fairly restricted and that a large number of cases are probably undetected in the population. In Canada, less than 2% of tests come back positive for the coronavirus, and the case fatality rate is much lower. A high case fatality rate paired with a large population of untested positive cases is strong evidence that the case fatality rate does not reflect the real risk of mortality.
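To see how much undetected infection could matter, here is a rough sketch. It uses the Italian test counts quoted above and a reported case fatality rate of 10%. The detection fractions are purely hypothetical, and the adjustment assumes that every death is counted while only a fraction of infections is confirmed. That is a strong simplification, and the function names are mine.

```python
def positivity(positive_tests: int, total_tests: int) -> float:
    """Proportion of tests that come back positive."""
    return positive_tests / total_tests


def adjusted_fatality_rate(observed_cfr: float, detection_fraction: float) -> float:
    """Fatality rate among all infections, ASSUMING all deaths are counted
    but only `detection_fraction` of infections are confirmed by a test."""
    if not 0 < detection_fraction <= 1:
        raise ValueError("detection_fraction must be in (0, 1]")
    return observed_cfr * detection_fraction


# Test counts quoted in the text for Italy.
print(f"Positivity: {positivity(50_000, 200_000):.0%}")

# Hypothetical detection fractions; the true value is unknown.
for fraction in (1.0, 0.5, 0.2, 0.1):
    rate = adjusted_fatality_rate(0.10, fraction)
    print(f"{fraction:.0%} of infections detected -> fatality rate {rate:.1%}")
```

Under these assumptions, a reported 10% would correspond to a 1% risk among all infections if only one infection in ten were being detected.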

China has been fairly stable at around 4%. If there were very few cases and very few deaths over the last month, this figure makes sense, even if the real case fatality rate has dropped over time: the 4% is a cumulative figure dominated by the high number of deaths earlier in the outbreak, and may not reflect the risk of death today.
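The gap between a cumulative figure and the current risk can be illustrated with a small sketch. The weekly counts below are invented for illustration (two large early weeks followed by two small recent ones) and are not Chinese data. The sketch also ignores the delay between diagnosis and death.

```python
from typing import List


def cumulative_cfr(cases: List[int], deaths: List[int]) -> List[float]:
    """Running total of deaths divided by running total of cases."""
    total_cases = 0
    total_deaths = 0
    rates = []
    for new_cases, new_deaths in zip(cases, deaths):
        total_cases += new_cases
        total_deaths += new_deaths
        rates.append(total_deaths / total_cases)
    return rates


def period_cfr(cases: List[int], deaths: List[int]) -> List[float]:
    """Deaths divided by cases within each period on its own."""
    return [d / c for c, d in zip(cases, deaths)]


# Hypothetical weekly counts: a large early wave, then very few cases.
weekly_cases = [1000, 1000, 50, 50]
weekly_deaths = [40, 40, 1, 1]

for week, (cum, per) in enumerate(
    zip(cumulative_cfr(weekly_cases, weekly_deaths),
        period_cfr(weekly_cases, weekly_deaths)),
    start=1,
):
    print(f"week {week}: cumulative {cum:.2%}, this week only {per:.2%}")
```

In this made-up series the cumulative rate barely moves from 4% even though the rate in the most recent weeks is half that, because the early weeks dominate both totals.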

Germany and Canada have case fatality rates at or below 1.5%. Germany has seen the case fatality rate rise slightly over time. In Canada, the case fatality rate has been fairly stable, though it is creeping up as testing becomes more targeted. Here is a timeline for proportion of positive tests (PPT) and case fatality rate (CFR) for Ontario:

I used different data for the US, from the COVID Tracking Project, which tracks cases as well as the frequency of testing. Here I plot the trend in proportion of positive tests and case fatality rate for the whole US:

The case fatality rate has been dropping in the US as testing has increased, from about 2.5% in early March to about 1.5% in late March. However, the US still has a very high proportion of positive tests (close to 15% right now), which suggests either that the infection is much more widespread than the positive tests indicate, or that the US is very selective in whom it tests.

## What does this mean?

If we control for 1) access to a well-functioning health care system, 2) age and 3) pre-existing health status, it’s hard to see how the case fatality rates would vary this much internationally or over time. Even the differences between the US and Canada/Germany can’t be explained by access to health care or differences in epidemiology. Is the average German really less than half as likely to die from a coronavirus infection as the average American? I highly doubt it.

As I stated at the outset, infection rate alone is not adequate for policy decisions. A high infection rate coupled with a low case fatality rate of 0.01% is not a unique public health crisis, and does not justify the social and economic upheaval we’ve seen. The coronavirus outbreak is a problem if case fatality and infection rates are high enough to cause a serious increase in deaths. Yet we remain uncertain about what these values actually are. So what do we do?

## Making decisions under conditions of uncertainty

For the sake of argument, let’s assume that the real case fatality rate is probably less than 1%. By real I mean that over the entire population, in a country where health care services are available, and where cause of death is directly attributed to the infection, the average infected person has less than a 1% chance of dying from an infection. In age-specific terms, older patients have a much higher risk, as do people with pre-existing illnesses. In the very young, the risk of death is very low, and the infection may even be less dangerous than the seasonal flu. In healthy middle-aged adults, it could be in the 0.5% range on average.

However, there is uncertainty in this estimate, and importantly, the bounds of uncertainty are not symmetrical. These bounds tell us what the expected uncertainty is around this estimate, and acknowledge that there is a range of possible true values that we are not certain about. The graphic below is an estimate of the probability that a given estimate of the true case fatality rate is correct:

This is a ‘ball-park’ figure; I have no idea what the true probability distribution is (I’ve not even labelled the y-axis). The red line is the location of the best estimate, which again is pure speculation. However, we can be pretty certain of some things. First, any reasonable estimate of this curve has to put a non-zero probability in the right tail, and the tail stretches out, perhaps to 2 or 3%. There is a very, very small chance that the case fatality rate could be above 4%, particularly if the strain evolves to become more virulent (deadly) over time. Second, there is essentially no probability that the case fatality rate is 0%, or even 0.01%, even if the virus evolves to become less virulent over time. The data, as flawed as they are, seem to show that this virus is killing people at least as often as seasonal influenza.
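One way to make the shape of such a curve concrete is to pick a right-skewed distribution and read off its tails. The sketch below uses a lognormal with a median of 0.5% and a spread parameter of 0.6. Both numbers, and the choice of a lognormal at all, are my own assumptions for illustration; they are not fitted to any data and are not the curve in the figure.

```python
import math
from statistics import NormalDist

# ASSUMPTIONS for illustration only: lognormal shape, median 0.5%, sigma 0.6.
MEDIAN_CFR = 0.005
SIGMA = 0.6
MU = math.log(MEDIAN_CFR)

_std_normal = NormalDist()  # standard normal, mean 0 and sd 1


def prob_cfr_above(threshold: float) -> float:
    """P(true CFR > threshold) under the assumed lognormal."""
    z = (math.log(threshold) - MU) / SIGMA
    return 1.0 - _std_normal.cdf(z)


def prob_cfr_below(threshold: float) -> float:
    """P(true CFR < threshold) under the assumed lognormal."""
    z = (math.log(threshold) - MU) / SIGMA
    return _std_normal.cdf(z)


# The mean of a lognormal sits above its median: the long right tail pulls it up.
mean_cfr = math.exp(MU + SIGMA ** 2 / 2)

print(f"median {MEDIAN_CFR:.2%}, mean {mean_cfr:.2%}")
print(f"P(CFR > 2%)    = {prob_cfr_above(0.02):.4f}")
print(f"P(CFR > 4%)    = {prob_cfr_above(0.04):.6f}")
print(f"P(CFR < 0.01%) = {prob_cfr_below(0.0001):.2e}")
```

With these made-up parameters the distribution behaves the way the paragraph describes: a small but non-zero chance of values above 2%, a much smaller chance above 4%, and effectively no mass near 0.01%.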

Furthermore, the nature of infectious diseases is that our decisions today directly impact the state of the world in the future. If we ignore infection control today, then at some point there is a good chance we will have to deal with the consequences of that decision; we won’t be able to reverse our policy and ‘uninfect’ ourselves. If we assume today that the case fatality rate is 0.05% (and accordingly, take no action), but it is actually 1.5%, then we live with the consequences of that decision of inaction forever.
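The size of the gap between those two scenarios is easy to work out. In this sketch the number of infections is a hypothetical round figure, not a forecast; only the two case fatality rates come from the paragraph above.

```python
def expected_deaths(infections: int, cfr: float) -> float:
    """Expected deaths for a given number of infections and fatality rate."""
    return infections * cfr


# Hypothetical number of infections, chosen only to make the arithmetic visible.
HYPOTHETICAL_INFECTIONS = 1_000_000

assumed = expected_deaths(HYPOTHETICAL_INFECTIONS, 0.0005)  # assumed CFR of 0.05%
actual = expected_deaths(HYPOTHETICAL_INFECTIONS, 0.015)    # actual CFR of 1.5%

print(f"deaths if CFR is 0.05%: {assumed:,.0f}")
print(f"deaths if CFR is 1.5%:  {actual:,.0f}")
print(f"ratio: {actual / assumed:.0f}x")
```

Whatever the true number of infections turns out to be, the ratio between the two outcomes is the same: planning around 0.05% when the truth is 1.5% underestimates deaths by a factor of 30.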

So, in spite of the weak data, and the complex and contested world of policy making in a time of crisis, there is good reason for policy makers and the public to act as if the coronavirus situation is serious even with the current uncertainty, at least until more data emerge that clarify what the true case fatality rate is.