Why coronavirus predictions might miss the mark

This is the third article in a series looking at the coronavirus pandemic in Malta from an actuarial perspective. Actuaries deal with the management of risk and uncertainty. The first article introduced the potential use of Markov Chains, while the second applied development factors.

The last article projected the total number of COVID-19 cases in Malta under five scenarios: growth patterns matching those of China, Italy, Spain or Singapore, or a constant 25.6% rate of growth.

By the end of March, there were 169 confirmed COVID-19 cases in Malta, which was far lower than any of the five predictions (the lowest was Singapore at 187 expected cases). As of April 7, Malta had 293 confirmed cases compared to a prediction of 258 cases if we were to follow Singapore’s growth pattern.

Here I explain why these predictions, and others, will be incorrect at this stage and why "when will the peak be?" is a bit of a silly question to ask.

Random Fluctuations

Since no outcome is certain, there are bound to be some fluctuations in the daily number of cases. For example, a simple prediction method we could use is that the number of confirmed cases tomorrow equals the average over the past five days. That means that on Thursday, April 9, we should expect 19.4 cases. However, it would be very reasonable to expect anywhere between 5 and 20.

In reality, that number can fluctuate a fair bit due to natural randomness. The variability is very high: we have seen anything from 6 to 52 cases per day over the past five days. Actuaries call this ‘process error.’
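The five-day average rule can be sketched in a few lines of Python. The daily counts below are illustrative figures, chosen only to be consistent with the range (6 to 52) and the average (19.4) quoted above; they are an assumption rather than an official data series, and the function name is hypothetical.

```python
from statistics import mean

def moving_average_forecast(daily_cases, window=5):
    """Forecast tomorrow's count as the mean of the last `window` days."""
    if len(daily_cases) < window:
        raise ValueError("need at least `window` days of data")
    return mean(daily_cases[-window:])

# Illustrative daily counts (assumed for this sketch, not official data).
illustrative_counts = [11, 14, 14, 52, 6]

forecast = moving_average_forecast(illustrative_counts)
low, high = min(illustrative_counts), max(illustrative_counts)
print(f"Forecast: {forecast:.1f} cases (observed range {low} to {high})")
```

A single point forecast hides how wide the observed range is, which is why the spread should be reported alongside it.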

External Factors

Any random fluctuations are bound to be affected by external factors. Lockdowns, wearing masks, closing schools and other such measures should all affect the growth rate of the pandemic, with varying degrees of success. Even the weather could have an effect - either directly (as there might be evidence of less COVID-19 spread in high temperature and high humidity) or indirectly, as many may decide to go for a picnic if the weather is beautiful and thus increase contagion.

Model Error

Pandemics tend to spread exponentially at first and then their rate of growth calms down, at which point the peak would have been reached. In most cases, an equation or model is developed to explain and predict the number of future cases.

The problem is that the wrong model or equation might be used. Take the example of predicting the next day’s number of cases by taking an average from the previous five days - are we sure this is a good model? Maybe we should use a proportion of growth? What if we just randomly select a number instead? All of these are models.
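To make the point concrete, the three candidate models mentioned above can be written side by side. This is a minimal sketch: the cumulative totals are hypothetical round numbers rather than Malta's figures, and the function names are invented for illustration.

```python
import random
from statistics import mean

# Hypothetical cumulative totals for six consecutive days (not real data).
cumulative = [100, 110, 125, 138, 150, 165]
daily = [b - a for a, b in zip(cumulative, cumulative[1:])]

def average_model(daily):
    """Tomorrow's new cases = average of the recent daily counts."""
    return mean(daily)

def growth_model(cumulative):
    """Tomorrow's new cases = latest total x average daily growth rate."""
    rates = [(b - a) / a for a, b in zip(cumulative, cumulative[1:])]
    return cumulative[-1] * mean(rates)

def random_model(daily, rng):
    """Tomorrow's new cases = a random pick within the recent range."""
    return rng.randint(min(daily), max(daily))

rng = random.Random(0)  # seeded so the sketch is repeatable
predictions = {
    "average": average_model(daily),
    "growth": growth_model(cumulative),
    "random": random_model(daily, rng),
}
for name, value in predictions.items():
    print(f"{name:>8}: {value:.1f} new cases")
```

All three run on the same data and give different answers, and nothing in the data alone says which one is right.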

The majority of models being applied assume a single peak at some point. But there may be a second wave. Singapore's number of cases in its first 20 days of the pandemic would probably have projected a total of fewer than 500 cases by now if fitted with a logistic function. Yet Singapore now has over a thousand confirmed cases.
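A logistic curve shows why such a model cannot capture a second wave: it rises, slows and then flattens at a fixed ceiling. The parameters below are assumed purely for illustration and are not fitted to Singapore's or any other country's data.

```python
import math

def logistic(t, ceiling, rate, midpoint):
    """Cumulative cases on day t under a logistic growth curve."""
    return ceiling / (1 + math.exp(-rate * (t - midpoint)))

# Assumed parameters for illustration only (not fitted to any data).
ceiling, rate, midpoint = 500, 0.25, 20

curve = [logistic(t, ceiling, rate, midpoint) for t in range(0, 121, 20)]
for day, cases in zip(range(0, 121, 20), curve):
    print(f"day {day:3d}: {cases:6.1f} cumulative cases")
```

However far the curve is extended, the total never exceeds the ceiling, so any renewed growth after the plateau falls outside what the model can describe.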

Many experts tend to get over-excited by fitting a model to the data without explaining (or understanding) its limitations. Anyone who is interested in reading a deeper discussion of this can look for discussions on Ersatz Models.

Calibration / Parameter Error

Even if we choose the correct model, many models require some sort of calibration - a parameter. For example, if we assume a percentage growth, we need to calculate that percentage. Should we use all days, or the percentage growth over just the past few days? Or maybe we should benchmark this percentage against experience in another country?
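The choice of calibration window can change the parameter considerably. The sketch below computes an average daily growth rate from a hypothetical cumulative series (not Malta's data) using all days and then only the last three; the helper name is an assumption made for illustration.

```python
def average_growth_rate(cumulative, days):
    """Geometric average daily growth rate over the last `days` days."""
    if not 0 < days < len(cumulative):
        raise ValueError("`days` must be between 1 and len(cumulative) - 1")
    start, end = cumulative[-days - 1], cumulative[-1]
    return (end / start) ** (1 / days) - 1

# Hypothetical cumulative totals (assumed, not real data).
cumulative = [10, 14, 19, 25, 31, 36, 40, 43, 45]

all_days = average_growth_rate(cumulative, len(cumulative) - 1)
last_three = average_growth_rate(cumulative, 3)
print(f"Growth using all days:    {all_days:.1%}")
print(f"Growth using last 3 days: {last_three:.1%}")
```

The same model, fed the same series, yields two quite different growth rates depending only on how far back we look.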

In any case, we should test how final predictions fluctuate if those parameters are changed - that is, how sensitive the model output is to small changes in calibration. My earlier predictions all gave a best-estimate range, as I was not certain of the parameters used within the models. For example, on March 19, I predicted 138 COVID-19 cases by March 23, but with a best-estimate range of 93 to 208 cases.

The best-estimate ranges produced tended to be wider on the pessimistic side (208 is 70 cases above 138, whereas 93 is only 45 below). This is due to the skewness of the expected parameters and partially because the downside risk is higher than the upside risk (in simple words, things are more likely to get worse than better). The actual number of cumulative cases on the day was 107.
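One way to see where such asymmetry can come from is a simple sensitivity test: move a growth parameter up and down by the same amount and compare the results. This is a sketch under assumed numbers (a hypothetical starting count and growth rate) and is not necessarily how the ranges quoted above were produced.

```python
def project(current_cases, daily_growth, days):
    """Compound a constant daily growth rate over `days` days."""
    return current_cases * (1 + daily_growth) ** days

current_cases = 100          # hypothetical starting total
days = 4                     # projection horizon
central, shift = 0.10, 0.05  # assumed growth rate and equal up/down shift

low = project(current_cases, central - shift, days)
mid = project(current_cases, central, days)
high = project(current_cases, central + shift, days)

print(f"Optimistic:  {low:.1f} ({mid - low:.1f} below central)")
print(f"Central:     {mid:.1f}")
print(f"Pessimistic: {high:.1f} ({high - mid:.1f} above central)")
```

Because growth compounds, an equal shift in the parameter produces a larger gap on the pessimistic side than on the optimistic side.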

Conclusion

Does all this mean that we cannot use models to predict COVID-19 cases in Malta? No. It does mean, however, that we need to appreciate their limitations and how they help us to explain different projections.

The role of the actuary, or any expert, is not simply to fit a model but to explain how real life deviates from that model and how to deal with it.

Dominic Cortis is an actuary and lecturer at the University of Malta's department of insurance.