Python is No Good for Mortality Rates


Here we are at the uptick in the Covid-19 pandemic. There are many sources of data which list infections and deaths as a result of the virus. It’s very tempting to want to put your Python skills to use and crunch some numbers on the infection. By and large, go for it, but one thing I’d ask you not to do is to try to calculate a “mortality rate”. This is not because Python can’t do division but, rather, because working this number out is conceptually pretty tricky. It’s something that epidemiologists need a lot of training to do correctly. You can’t just take the deaths column and divide it by the infected column, because the infected column only counts the infections that were detected, not everyone who was actually infected. For example:

  • Testing is incomplete. There is a shortage of test kits where I am. So, if you present with symptoms you will not get tested unless you meet the testing protocol, that is: have you been overseas in the last 14 days, or have you been in contact with a known Covid-19 case? This testing protocol means that (most) community transmission of the virus hereabouts is not included in the numbers.
  • There is evidence to suggest that a large cohort of those infected are asymptomatic. That is, they have no symptoms or very mild symptoms. If that’s the case, then there is a cohort of infected people who don’t feel unwell, so they don’t get tested and are also omitted from the infection numbers.

Both factors shrink the infected count without changing the death count, so naive division of the reported numbers will inflate the mortality rate, making it seem worse, possibly much worse, than it really is. The Economist argues [paywall] that places with extensive testing have much lower death rates, by a factor of 5 or so, simply because they are identifying more of those infected.
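To see how much the answer depends on who got counted, here is a minimal sketch. The numbers are made up for illustration and are not real data; the function names and the `ascertainment` parameter (an assumed fraction of infections that were actually tested and reported) are my own inventions for this example. It also ignores other problems, such as the lag between infection and death, so it is a demonstration of the problem rather than a way to work out the real rate.

```python
def naive_ratio(deaths, reported_cases):
    """Deaths divided by reported cases: the tempting calculation."""
    return deaths / reported_cases


def adjusted_ratio(deaths, reported_cases, ascertainment):
    """Same deaths, divided by an estimate of all infections.

    ascertainment is the ASSUMED fraction of infections that were
    tested and reported (greater than 0, at most 1). Nobody hands you
    this number; guessing it is exactly the hard part.
    """
    if not 0 < ascertainment <= 1:
        raise ValueError("ascertainment must be greater than 0 and at most 1")
    estimated_infections = reported_cases / ascertainment
    return deaths / estimated_infections


# Made-up numbers, for illustration only
deaths = 50
reported_cases = 1000

print(f"naive division: {naive_ratio(deaths, reported_cases):.1%}")
for ascertainment in (1.0, 0.5, 0.2):
    ratio = adjusted_ratio(deaths, reported_cases, ascertainment)
    print(f"if {ascertainment:.0%} of infections were reported: {ratio:.1%}")
```

With these invented inputs, the same deaths column gives anything from 5% down to 1% depending on the guess about how many infections were missed, which is the same factor of 5 or so mentioned above.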

Ideally, if you’re going to publish these numbers, make it clear what their limitations are.

PS: The Diamond Princess is probably the only cohort to have reliable infection numbers, since everyone was tested before leaving the ship. However, its mortality rate shouldn’t be applied to the general population, as the passengers are not a representative sample (i.e. mostly older people who are fit enough and wealthy enough to go on a cruise).

PPS: Further virus-related stuff (e.g. on lag) to be posted on my other, not-actually-Python blog.

3 Responses to Python is No Good for Mortality Rates

  1. Pingback: Python 4 Kids: #Python is No Good for Mortality Rates https://pytho… | Dr. Roy Schestowitz (罗伊)

  2. Pingback: Python 4 Kids: Python is No Good for Mortality Rates https://python… | Dr. Roy Schestowitz (罗伊)

  3. Al Shams says:

    Sir please stay safe and well. Best wishes for you and your family.
    I found your book Python for Kid very easy to understand even though the information needs update.
    *****It still is better than other books with fancy names that are only good for lifting weights, in my opinion. Thank you for your hard work; your book helped this 40+ man to code again in these Covid 19 times. Alshams 2020
