This version has preliminary implementations of automatic differentiation and of arithmetic on lists. These are both useful for gradient-based optimization, such as maximum likelihood estimation and neural network training, as well as gradient-based MCMC methods. List arithmetic is helpful when dealing with models that have several groups of parameters, which are most conveniently represented using a list of vectors or matrices, rather than a single vector.

You can read the documentation on these facilities here and here. Some example programs are in this repository. I previously posted about the automatic differentiation facilities here. Automatic differentiation and arithmetic on lists for pqR are both discussed in this talk, along with some other proposals.

For the paranoid, here are the shasum values for the compressed and uncompressed tar files that you can download from pqR-project.org, allowing you to verify that they were downloaded uncorrupted:

c1b389861f0388b90122cbe1038045da30879785 pqR-2020-07-23.tar.gz 04b4586601d8796b12c310cd4bf81dc057f33bb2 pqR-2020-07-23.tar]]>

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period, Science, vol. 368, pp. 860-868, 22 May 2020 (released online 14 April 2020). The paper is also available here, with supplemental materials here.

In my previous post, I talked about how the authors estimate the reproduction numbers (*R*) over time for the four common cold coronavirus, and how these estimates could be improved. In this post, I’ll talk about how Kissler et al. use these estimates for *R* to model immunity and cross-immunity for these viruses, and the seasonal effects on their transmission. These modelling results inform the later parts of the paper, in which they consider various scenarios for future transmission of SARS-CoV-2 (the coronavirus responsible for COVID-19), whose characteristics may perhaps resemble those of these other coronaviruses.

The conclusions that Kissler et al. draw from their model do not seem to me to be well supported. The problems start with the artifacts and noise in the proxy data and *R* estimates, which I discussed in Part 2 and Part 3. These issues with the *R* estimates induce Kissler et al. to model smoothed *R* estimates, which results in autocorrelated errors that invalidate their assessments of uncertainty. The noise in *R* estimates also leads them to limit their model to the 33 weeks of “flu season”; consequently, their model cannot possibly provide a full assessment of the degree of seasonal variation in *R*, which is one matter of vital importance. The conclusions Kissler et al. draw from their model regarding immunity and cross-immunity for the betacoronavirues are also flawed, because they ignore the effects of aggregation over the whole US, and because their model is unrealistic and inconsistent in its treatment of immunity during a season and at the start of a season. A side effect of this unrealistic immunity model is that the partial information on seasonality that their model produces is biased.

After justifying these criticisms of Kissler et al.’s results, I will explore what can be learned using better incidence proxies and *R* estimates, and better models of seasonality and immunity.

The code I use (written in R) is available here, with GPLv2 licence.

I’ll start by presenting what Kissler et al. did in their paper — how they modelled *R* for the common cold coronaviruses, how they interpreted the results of fitting their model, and how they think it might relate to SARS-CoV-2 (the virus causing COVID-19). The paper focuses on the two betacoronaviruses that cause common colds — OC43 and HKU1 — since SARS-CoV-2 is also a betacoronavirus.

Kissler et al. model log(*R*) for each virus, for each of the 33 weeks of each year’s “flu season” — from week 40 of a year (early October) to week 20 of the next year (mid May). Here, *R* is the smoothed estimate of *R _{u}*, as I discussed in Part 3 of this series (I’ll refer to this as just

- Indicators for each season and virus, whose coefficients will give the values of
*R*for each virus at the start of each season. - A set of 10 spline segments, used to model the change in R due to seasonal effects (assumed the same for all years, and for both viruses).
- The cumulative incidence of the virus for which log(R) is being modelled, within the current season (using the chosen incidence proxy). This cumulative incidence is seen as a measure of the “depletion of susceptibles” due to immunity.
- The cumulative incidence of the other virus, within the current season. This is seen as allowing the model to capture cross-immunity due to infection by the other common cold betacoronavirus.

Here Fig. 1 from their paper, showing the effects of these components of the model, using the regression coefficients fitted to the data (click for larger version):

These plots show the multiplicative effects on *R*, derived from the additive effects on log(*R*) in the linear regression model. The gold line shows the seasonal effect. The red and blue lines are the “immunity” effects, for HKU1 and OC43 respectively. (Hence, for HKU1, red is the effect of immunity to HKU1, and the blue line is the effect of cross-immunity from infection with OC43, whereas for OC43, this is reversed, with red showing the cross-immunity effect from HKU1.) The black dot at the left is the initial *R* value, relative the value for HKU1 in 2014-2015. The plots include uncertainty bands for these effects.

I have replicated these results (except for the uncertainty bands). Below are my plots analogous to those above (click for larger version):

The two columns on the left above have plots of the multiplicative effects of each model component, corresponding to Kissler et al.’s plots. The two right columns show plots of the additive effects on log(*R*). The plots above also include the absolute level of the initial value of R (green dot), the total effect from all components in the model (red line), and the *R* estimates that the model was fitted to (pink dots, some of which may be off the plot). I plot the effect of cumulative seasonal incidence of the same virus as a black line, and the “cross-immunity” effect from cumulative seasonal incidence of the other virus as a grey line.

Kissler et al. truncated their plots at week 30 of flu season (as I discussed in Part 1). My plots above go to the end of the season (week 33).

For closer examination, here is the portion of Kissler et al.’s Fig. 1 for HKU1 in the 2017-2018 season:

Here is the corresponding plot from my replication:

One can see an exact correspondence, up to week 30. One can also see that truncating the plot at week 30 leaves out a significant rise in Kissler et al.’s estimate of the seasonal effect on *R* in the last three weeks.

Here are the regression coefficients from the model fit in my replication:

Estimate Std. Error HC3.se NW.se seasonal_spline1 0.45088 0.10610 0.22478 0.16287 seasonal_spline2 0.24535 0.08779 0.15137 0.23198 seasonal_spline3 0.48011 0.08539 0.16118 0.14166 seasonal_spline4 0.07054 0.08333 0.14408 0.16536 seasonal_spline5 0.35121 0.10858 0.16415 0.15480 seasonal_spline6 0.04285 0.14018 0.17968 0.20912 seasonal_spline7 0.14935 0.15413 0.19551 0.19489 seasonal_spline8 0.09484 0.16603 0.19778 0.25869 seasonal_spline9 0.25369 0.16668 0.20295 0.20424 seasonal_spline10 0.29357 0.15608 0.20286 0.21917 OC43_same -0.00187 0.00051 0.00044 0.00061 OC43_other -0.00076 0.00044 0.00039 0.00056 HKU1_same -0.00251 0.00044 0.00044 0.00060 HKU1_other -0.00133 0.00051 0.00048 0.00063 OC43_season_2014 0.06825 0.05986 0.12802 0.14079 OC43_season_2015 -0.01195 0.06320 0.13376 0.13974 OC43_season_2016 0.04066 0.06053 0.12648 0.14115 OC43_season_2017 0.06951 0.06569 0.12982 0.14785 OC43_season_2018 0.06298 0.06090 0.12586 0.13743 HKU1_season_2014 0.12779 0.05986 0.17615 0.16235 HKU1_season_2015 0.01084 0.06320 0.14456 0.15355 HKU1_season_2016 0.15110 0.06053 0.15803 0.15721 HKU1_season_2017 0.21279 0.06569 0.13662 0.14642 HKU1_season_2018 0.19568 0.06090 0.15752 0.15977 Residual standard deviation for OC43: 0.093 Residual ACF: 1 0.66 0.18 -0.05 0 0.06 -0.07 -0.26 -0.36 -0.3 -0.17 -0.08 Residual standard deviation for HKU1: 0.2061 Residual ACF: 1 0.55 -0.08 -0.29 -0.11 0 -0.07 -0.18 -0.13 0.02 0.14 0.08

Here, “same” and “other” refer to the immune effects of the same or the other virus. The coefficients above match those in Table S1 in the supplementary information for the Kissler et al. paper, except they use a different (but equivalent) parameterization in which values for OC43 and seasons other than 2014 are relative to HKU1/2014.

The standard deviation of the residuals of this model for HKU1 is over twice that for OC43. This is a problem, since the model fits data for the two viruses jointly, assuming a common value for the residual standard deviation. The error bands in Kissler et al.’s plot are found using a procedure that is supposed to adjust for differing residual variances — either the residuals having higher variance for HKU1 than OC43, or to their variance varying with time. These adjusted standard errors are shown in the `HC3.se` column. This adjustment has a substantial effect on the standard errors for the spline components, but not much effect on the standard errors for the immunity and cross-immunity coefficients. (I will talk about the `NW.se` column later.)

The 95% confidence bands for the immunity effects in Kissler et al.’s Fig. 1 correspond roughly to plus-or-minus twice the HC3 standard errors of the coefficients as shown above. For example, the coefficient of -0.00133 for cross-immunity of HKU1 to OC43 has a standard error of 0.00048, so the 95% confidence interval for the coefficient is approximately -0.00229 to -0.00037. The cumulative incidence of OC43 at the end of the 2017-2018 season was about 160. Exponentiating to get the multiplicative cross-immunity effect gives a range of exp(-0.00229*160) to exp(-0.00037*160), which is about 0.69 to 0.94, matching the end-of-season range in Kissler et al.’s plot.

Kissler et al. interpret the results of this model as follows:

As expected, depletion of susceptibles for each betacoronavirus strain was negatively correlated with transmissibility of that strain. Depletion of susceptibles for each strain was also negatively correlated with the R

_{e}of the other strain, providing evidence of cross-immunity. Per incidence proxy unit, the effect of the cross-immunizing strain was always less than the effect of the strain itself (table S1), but the overall impact of cross-immunity on the R_{e}could still be substantial if the cross-immunizing strain had a large outbreak (e.g., HCoV-OC43 in 2014–2015 and 2016–2017). The ratio of cross-immunization to self-immunization effects was larger for HCoV- HKU1 than for HCoV-OC43, suggesting that HCoV-OC43 confers stronger cross-immunity. Seasonal forcing appears to drive the rise in transmissibility at the start of the season (late October through early December), whereas depletion of susceptibles plays a comparatively larger role in the decline in transmissibility toward the end of the season. The strain-season coefficients were fairly consistent across seasons for each strain and lacked a clear correlation with incidence in prior seasons, consistent with experimental results showing substantial waning of immunity within 1 year (15).

Later, discussing the results of using the same data to fit a SEIRS model (which I’ll discuss in a future post), they comment that

According to the best-fit model parameters, the R

_{0}for HCoV-OC43 and HCoV-HKU1 varies between 1.7 in the summer and 2.2 in the winter and peaks in the second week of January, consistent with the seasonal spline estimated from the data. Also in agreement with the findings of the regression model, the duration of immunity for both strains in the best-fit SEIRS model is ~45 weeks, and each strain induces cross-immunity against the other, although the cross-immunity that HCoV-OC43 infection induces against HCoV-HKU1 is stronger than the reverse.

The summer/winter *R _{0}* ratio of 1.7/2.2=0.77 in this SEIRS model does indeed closely match the low/high ratio of seasonal effects in the regression model, which can be seen above to be approximately 1.1/1.45=0.76. And a duration of 45 weeks for immunity is consistent with the regression model’s assumption that immunity does not decline during the 33 week flu season, but is mostly gone by the start of the next season (19 weeks later), as evidenced by the range of coefficients for

I summarize the conclusions Kissler et al. draw from this regression model as follows:

- Both betacoronaviruses induce immunity against themselves, but this immunity is mostly gone less than one year later.
- Both viruses also confer substantial cross-immunity against the other (larger for OC43 against HKU1 than the reverse), though this is not as strong as self-immunity.
- There is a seasonal effect on transmission, whose magnitude reduces
*R*by a factor of about 0.76 in low season compared to the peak.

The model also produces an estimated curve for the seasonal effect (within flu season), but apart from pointing to its rise at the start of flu season as initiating seasonal outbreaks, Kissler et al. perhaps do not attach great significance to the detailed shape of this estimated curve, since the error bands they show around it are quite large. However, they do use this model as evidence regarding the overall magnitude of the seasonal effect, which, to the extent it might be indicative of seasonality for COVID-19, is of great practical importance.

Mostly no, I think. I don’t mean that the conclusions are necessarily all *false* (though some may be), just that, for several reasons, modelling the data that they use in the way that they do does not provide good evidence for their conclusions.

Some of the problems originate in the proxy they use for incidence of coronavirus infection. As I discussed in Part 2, their proxy, which I call proxyW, has artifacts due to holidays, some points affected by outliers, an issue with zero measurements, and noise that is best reduced by smoothing the incidence proxy rather than later. I proposed several proxies that seem better.

These problems with the incidence proxy influence estimation of weekly *R* values for each virus. I argued in Part 3 that modelling *R _{t}* is better than modelling

Furthermore, the handling of immunity in the regression model Kissler et al. fit would be appropriate for modelling *R _{t}* but not

Since *R _{u}* averages values of

If we try modelling *R _{t}* with Kissler et al.’s regression model, here is what we get for HKU1 in 2017-2018:

The artifacts and noise in the *R _{t}* estimates produce clearly ridiculous results, with strong negative cross-immunity, and a seasonal effect that has no resemblance to what we would expect.

The results of applying Kissler et al.’s model to the unsmoothed *R _{u }*are not so bad. Below is a comparison of Kissler et al.’s model for smoothed

The differences are noticeable, but not drastic. The rise in the seasonal effect at the start of the season is steeper with unsmoothed *R _{u}* values, so the whole seasonal curve is shifted up (partly compensated for by the green dot shifting down). The rise at the end of the season is even larger than before. The immunity and cross-immunity effects are slightly larger.

The results using the unsmoothed *R _{u}* values are a better guide to what this model is saying. These results are somewhat unexpected — rather than the sharp rise in the seasonal effect seen at the end of flu season, one might instead expect a drop, and the very steep rise at the beginning of the flu season is also odd, since nothing of apparent note happens then (schools open earlier). These oddities should prompt one to critically examine both the data and the assumptions behind the model. Somewhat obscuring the problems by using smoothed

Further insight comes from looking in detail at the estimates found using unsmoothed *R _{u}* values:

Estimate Std. Error HC3.se NW.se seasonal_spline1 0.57484 0.18245 0.31483 0.21023 seasonal_spline2 0.31601 0.15096 0.25117 0.29209 seasonal_spline3 0.58870 0.14684 0.20252 0.17467 seasonal_spline4 0.13309 0.14331 0.18339 0.20336 seasonal_spline5 0.48668 0.18671 0.18977 0.19179 seasonal_spline6 0.14439 0.24106 0.22037 0.23593 seasonal_spline7 0.27262 0.26504 0.23446 0.24860 seasonal_spline8 0.24716 0.28551 0.24667 0.27696 seasonal_spline9 0.31697 0.28663 0.25691 0.25880 seasonal_spline10 0.48711 0.26841 0.26182 0.27002 OC43_same -0.00204 0.00088 0.00063 0.00060 OC43_other -0.00090 0.00076 0.00056 0.00056 HKU1_same -0.00274 0.00076 0.00068 0.00066 HKU1_other -0.00145 0.00088 0.00074 0.00063 OC43_season_2014 -0.00951 0.10294 0.14988 0.16930 OC43_season_2015 -0.10040 0.10868 0.15427 0.16894 OC43_season_2016 -0.04292 0.10410 0.14776 0.16883 OC43_season_2017 -0.00409 0.11297 0.15312 0.17150 OC43_season_2018 -0.01522 0.10473 0.14967 0.16304 HKU1_season_2014 0.04350 0.10294 0.21745 0.21514 HKU1_season_2015 -0.07347 0.10868 0.16365 0.18131 HKU1_season_2016 0.05114 0.10410 0.20899 0.18428 HKU1_season_2017 0.14614 0.11297 0.16333 0.17836 HKU1_season_2018 0.11115 0.10473 0.20682 0.19546 Residual standard deviation for OC43: 0.1415 Residual ACF: 1 0.28 -0.23 -0.04 0.05 0.02 0.04 -0.15 -0.22 -0.12 -0.08 -0.03 Residual standard deviation for HKU1: 0.3622 Residual ACF: 1 0.17 -0.37 -0.17 0.03 0.08 -0.01 -0.13 -0.09 0.01 0.11 0.08

Notice that the estimate for cross immunity from OC43 to HKU1 (given by `HKU1_other` above) has increased slightly (compared to using smoothed *R _{u}* values), but its magnitude is now less than twice its standard error, and hence is not significant at the conventional 0.05 level — that is, by the conventional criterion, we cannot be confident that there actually is any cross-immunity. The estimate for cross immunity in the other direction, though again larger, is also less significant (it was marginal before), because its standard error is now larger.

When properly computed, the standard errors of estimates in the model for smoothed *R _{u}* values that Kissler et al. fit are also large enough that zero cross immunity (in either direction) is plausible. The residuals for that model have significant autocorrelation at lag one (0.66 for OC43, 0.55 for HKU1), which tends to increases the standard errors. The

The autocorrelations of the residuals are at least partly the result of fitting smoothed values for *R _{u}* — the smoothed values are averages over three weeks, with two of these weeks being common to two adjacent values, inducing autocorrelation. The estimated lag-one autocorrelations for residuals in the model of unsmoothed

Since the estimated cross-immunity effects are at most marginally significant, the conclusions Kissler et al. draw about cross-immunity do not seem justified.

The estimated curve for the seasonal effect on transmission seems rather strange. Perhaps the abrupt rises at the beginning and end of the flu season are artifacts of the handling of end-points by the spline procedure used (I have not investigated this enough to be sure). Or perhaps a strange result should not be seen as surprising, since the error bands for the seasonal effect curve in Kissler et al.’s Fig. 1 are quite large, and might be even larger if the effect of autocorrelation in the residuals were accounted for. But on that view, the uncertainty in the inferred seasonal effect is so large that no useful conclusions can be drawn.

Kissler et al.’s conclusion that each virus induces self-immunity might seem justified, since the estimates for `OC43_same` and `HKU1_same` are much bigger than twice their standard errors, when using either the smoothed or unsmoothed values for *R _{u}*, and with any of the methods used to compute standard errors. That these viruses induce some immunity is well supported from other evidence in any case. However, caution is warranted when using this model to judge the strength of immunity, or its duration. Using terms such as “depletion of susceptibles” and “self-immunity” in reference to covariates and regression coefficients in a model does not mean that these features of the model

The data being modelled is aggregated over the entire United States. On the NREVSS regional page, one can see the regional breakdown for positive common cold coronavirus tests in the last two years. For OC43, the peaks in the 2018-2019 season in the four regions occur on January 12 (Northeast), January 26 (Midwest), January 5 (South), and December 22 (West) — a spread of five weeks. The “depletion of susceptibles” in the West will have little influence on transmission in the Northeast. There must be geographical variation on a finer scale as well. It is therefore not safe to interpret the regression coefficient on “depletion of susceptibles” as an indication of strength of immunity.

The model used by Kissler et al. assumes that “immunity” persists undiminished during flu season, but that it is largely gone by the start of the next season. This seems implausible — why would immunity that had lasted for 20 or more weeks during the flu season be almost completely gone after another 19 weeks? More plausibly, the “immunity” (i.e., the regression coefficient on “depletion of susceptibles”) actually starts to decline during the flu season, not because the immune response of infected individuals declines, but because the locus of the outbreak moves from initially infected geographical regions to other regions, whose susceptibles have not yet been depleted. If a model does not include such a decline in “immunity”, the effect the decline would have had may be be produced instead by another component of the model. This may explain why the seasonal effect goes up after week 22 of the season (early March) — as a substitute for the “immune” effect becoming less. If this is happening, then the magnitude of the seasonal effect in Kissler et al.’s model underestimates the true seasonal effect.

One way of trying to get more useful results than Kissler et al. do is to use better *R* estimates, based on better proxies for incidence of the betacoronaviruses, as I discussed in Part 2 and Part 3 of this series.

The proxyAXss-filter *R* estimates differ from the proxyW estimates by:

- Using the ratio of ILI visits to non-ILI visits as a proxy for Influenza-Like Illness rather than the (weighted) ratio of ILI visits to total visits.
- Removing artifacts due to holidays by interpolating from neighboring non-holiday weeks.
- Replacing some outliers in percent positive tests for coronaviruses by reasonable values.
- Replacing zero values for percent positive tests by reasonable non-zero values.
- Smoothing the percentage of positive tests for coronaviruses due to each virus.
- Applying a smoothing filter to the proxies after interpolating to daily values.

Modelling the unsmoothed *R _{u}* estimates from proxyAXss-filter gives the following results for 2017-2018:

The sharp rises in the seasonal effect at the beginning of the season is less than for the model of unsmoothed *R _{u}* values with proxyW (shown above). There is some autocorrelation at lag one in the residuals (0.53 for both viruses), perhaps due to the smoothing done for the proxies (though

Since proxyAXss-filter has less noise, somewhat reasonable results are obtained when modelling *R _{t}*:

I described several other approaches to reducing noise in the proxies in Part 2. Here is a plot from a model using the proxyDn estimates for *R _{t}*:

The estimates for this model are as follows:

Estimate Std. Error HC3.se NW.se seasonal_spline1 0.08769 0.06043 0.06199 0.03482 seasonal_spline2 0.14846 0.04995 0.04751 0.04079 seasonal_spline3 0.31573 0.04869 0.05294 0.05572 seasonal_spline4 0.08249 0.04499 0.05382 0.06426 seasonal_spline5 0.25449 0.05238 0.06374 0.08768 seasonal_spline6 -0.12442 0.06334 0.07115 0.11831 seasonal_spline7 -0.09949 0.06855 0.07913 0.13046 seasonal_spline8 -0.20727 0.07549 0.08118 0.13549 seasonal_spline9 -0.03293 0.07526 0.07716 0.12224 seasonal_spline10 -0.01487 0.06734 0.07468 0.13326 OC43_same -0.00123 0.00019 0.00018 0.00037 OC43_other -0.00047 0.00020 0.00026 0.00055 HKU1_same -0.00252 0.00020 0.00022 0.00040 HKU1_other 0.00002 0.00019 0.00018 0.00035 OC43_season_2014 0.14325 0.03397 0.04386 0.05663 OC43_season_2015 0.11544 0.03504 0.03894 0.04554 OC43_season_2016 0.11594 0.03423 0.03661 0.02958 OC43_season_2017 0.15530 0.03713 0.04109 0.05545 OC43_season_2018 0.18876 0.03579 0.03744 0.03144 HKU1_season_2014 0.03563 0.03397 0.04567 0.07521 HKU1_season_2015 0.11766 0.03504 0.04319 0.05390 HKU1_season_2016 0.02589 0.03423 0.03686 0.03899 HKU1_season_2017 0.29256 0.03713 0.03947 0.04218 HKU1_season_2018 0.10250 0.03579 0.03600 0.03676 Residual standard deviation for OC43: 0.09 Residual ACF: 1 0.63 0.31 0.22 0.12 0.03 -0.03 -0.12 -0.19 -0.22 -0.29 -0.34 Residual standard deviation for HKU1: 0.0922 Residual ACF: 1 0.63 0.31 0.21 0.09 -0.03 -0.09 -0.15 -0.17 -0.14 -0.14 -0.15

There is significant autocorrelation, due to the smoothing of proxies, so the Newey-West standard errors should be used.

The coefficient for cross-immunity from OC43 to HKU1 is virtually zero. The coefficient for cross-immunity from HKU1 to OC43 only about one standard error away from zero. This confirms that a model of the sort used by Kissler et al. provides no evidence of cross-immunity between the two betacoronaviruses, when it is properly fit to cleaned-up data. There is certainly no support for the claim by Kissler et al. that cross-immunity from OC43 against HKU1 is stronger than the reverse.

That this model fails to show cross-immunity is not really surprising. Because the data is aggregated over all regions of the United States, it is quite likely that many of those infected with one of the viruses are in regions where the other virus is not currently spreading. In the long term, both viruses might induce immunity in people in every region, and cross-immunity between might be evident in that context, but this model is sensitive only to cross-immunity on a time scale of a few months.

For HKU1, the estimated coefficients of the indicators of the five seasons differ quite a bit. Moreover, the largest coefficient, 0.29256 for the 2017-2018 season, follows the season with the smallest incidence for HKU1, as seen here:

The second highest coefficient, 0.11766 for the 2015-2016 season, follows the season with the second-smallest incidence. Contrary to the conclusions of Kissler et al., it seems quite plausible that these seasonal coefficients reflect the level of continuing immunity from the previous season.

To learn more from this data, we need to use a better model than Kissler et al.’s. I’ll confine myself here to models that mostly use the same techniques that Kissler et al. use — that is, linear regression models for log(*R*), fit by least squares.

Getting a better idea of the seasonal effect on transmission is important. To do this, the model must cover the whole year, including summer months when the seasonal reduction in transmission is perhaps at its strongest. The bias in estimation of the seasonal effect that can result from using an unrealistic “immunity” model also needs to be eliminated. Of course, a good model for immunity and cross-immunity is also of great interest in itself.

It’s easy to extend the “flu season” to the whole year — just set the season start to week 28 (early July) and the season end to week 27 of the next year. If that’s all that’s done, however, the results are ridiculous, since the model then assumes that immunity abruptly jumps to zero between week 27 and week 28.

I’ve tried several variations on more realistic models of immunity. Some of the more complex models that I tried did not work because their multiple factors cannot easily be identified from the data available, at least when working in the least-squares linear regression framework.

For example, I tried to correctly model that immunity should be represented by a term in the expression for log(*R*) of the form log(1-*d*/*T*), where *d* is a proxy for the number of immune people, and *T* is the total population, scaled by the unknown scaling factor for the proxy *d*. This corresponds the a multiplicative factor of (1-*d*/*T*) for *R*, which represents the decline in the susceptible population. If *d*/*T* is close to zero, log(1-*d*/*T*) will be approximately –*d*/*T*, and the model can be fit by including *d* as a covariate in the linear regression, with the regression coefficient found corresponding to -1/*T*. This is what is implicitly done in Kissler et al.’s model.

But when considering long-term immunity, assuming that *d*/*T* is close to zero seems unrealistic. I therefore tried including log(1-*d*/*T*) as a covariate, expecting its regression coefficient to be close to one, and then chose *T* by a search procedure to minimize the standard deviation of the model residuals. Unfortunately, there are too many ways that parameters in this model can trade-off against each other (e.g., *T* and the regression coefficient for log(1-*d*/*T*)), producing results in which these parameters do not take on their intended roles (even if the data is modelled well in terms of residual standard deviation). I therefore abandoned this approach, and am instead hoping that even if *d*/*T* is not small, its variation may be small enough that log(1-*d*/*T*) is approximatley linear in the region over which *d*/*T* varies.

The immunity model I tried that seems to work best assumes that there is both an “immunity” of short duration (actually caused by the data being aggregated over regions where infection spreads at different times) and long-term immunity (which is intended to represent widespread immunity in the population). Both forms of immunity are modelled using exponentially-weighted sums of the incidence proxies for each virus, but with different rates of decay in “immunity”. For short-term immunity, the incidence *k* weeks ago is given weight 0.85^{k}, which corresponds to decay of immunity by a factor of two after 4 weeks. For long-term immunity, the weight for the incidence *k* weeks ago is 0.985^{k}, which corresponds to decay of immunity by a factor of two after 46 weeks. These two exponentially weighted sums for a virus are included as covariates in the regression model for that virus, and the long-term sum is included as a covariate for the other virus, allowing cross-immunity to be modelled.

The way seasonality is modelled also needs to change, to avoid a discontinuity at the end of one full-year season and the beginning of the next. A periodic spline would be a possibility, but the `bs` function used in Kissler et al.’s R model does not have that option. I used a Fourier representation of a periodic seasonality function, including covariates of the form sin(2π*fy*) and cos(2π*fy*), where *y* is the time in years since the start of the data series, and *f* is 1, 2, 3, 4, 5, or 6. This seasonality model is common to both viruses.

Finally, I included a spline to model long term trends (e.g., change in population density), which is also common to both viruses, and a overall offset for each virus (same for all years). The long-term trend spline was limited to 5 degrees of freedom so that it will not be able take over the modelling of seasonality or immunity effects.

Here are the results of fitting this model using proxy data proxyEn-daily, which is the proxy with the most smoothing / least noise (see Part 3), for 2017-2018 (start of July to end of June):

The dotted vertical lines show the start and end of “flu season”, to which the previous models were limited. The green line is the long-term trend; the dark green dot on the left is the overall offset for that virus. The solid black line is the short-term immunity effect; the dashed black line is the long-term immunity effect. The dashed grey line is the long-term cross-immunity effect. As before, the orange line is the seasonal effect, the red line is the sum of all effects, and the pink dots are the estimates for *R _{t}* that the red line is fitted to.

The estimated seasonal effect on log(*R*) has a low-to-high range of 0.44, which corresponds to R being 1.55 times higher in mid-November than in mid-June. The shape of the seasonal effect curve during the “flu season” is quite similar to the shape found using the model limited to flu season, with proxyDn estimates for *R _{t}* (shown above), though the range is slightly greater, and the rise at the end of flu season is smaller.

For both viruses, the plot shows both short and long term self-immunity effects. The plot shows some cross-immunity from HKU1 to OC43, but little indication of cross-immunity from OC43 to HKU1.

Here are the plots for all years, in both additive and multiplicative form (click for larger version):

Here are the estimates for the regression coefficients, and their standard errors:

Estimate Std. Error HC3.se NW.se trend_spline1 -0.07411 0.04928 0.05926 0.29550 trend_spline2 -0.06561 0.02890 0.03710 0.18423 trend_spline3 -0.09946 0.04761 0.06531 0.33427 trend_spline4 0.27051 0.03097 0.04304 0.23808 trend_spline5 0.02323 0.05161 0.08320 0.29286 sin(1 * 2 * pi * yrs) 0.11735 0.01395 0.01832 0.07656 cos(1 * 2 * pi * yrs) -0.16124 0.00625 0.00746 0.02524 sin(2 * 2 * pi * yrs) -0.02905 0.00580 0.00617 0.02275 cos(2 * 2 * pi * yrs) 0.02184 0.00476 0.00541 0.01813 sin(3 * 2 * pi * yrs) -0.00949 0.00422 0.00391 0.00681 cos(3 * 2 * pi * yrs) -0.01828 0.00430 0.00471 0.00860 sin(4 * 2 * pi * yrs) 0.01808 0.00414 0.00394 0.00785 cos(4 * 2 * pi * yrs) -0.02917 0.00419 0.00429 0.00722 sin(5 * 2 * pi * yrs) -0.01345 0.00412 0.00425 0.00560 cos(5 * 2 * pi * yrs) -0.00494 0.00417 0.00406 0.00527 sin(6 * 2 * pi * yrs) 0.01211 0.00412 0.00436 0.00509 cos(6 * 2 * pi * yrs) 0.00014 0.00416 0.00405 0.00290 OC43_same -0.00136 0.00020 0.00020 0.00071 OC43_samelt -0.00099 0.00019 0.00028 0.00097 OC43_otherlt -0.00098 0.00013 0.00020 0.00098 HKU1_same -0.00060 0.00024 0.00030 0.00104 HKU1_samelt -0.00208 0.00015 0.00019 0.00091 HKU1_otherlt -0.00021 0.00017 0.00022 0.00094 OC43_overall 0.52199 0.07282 0.10294 0.39844 HKU1_overall 0.37802 0.07037 0.09145 0.39340 Sum of OC43_same and OC43_samelt : -0.00235 NW std err 0.00095 Sum of HKU1_same and HKU1_samelt : -0.00268 NW std err 0.00082 Seasonality for week50 - week20 : -0.44075 NW std err 0.10621 Residual standard deviation for OC43: 0.0625 Residual ACF: 1 0.93 0.76 0.54 0.32 0.13 -0.01 -0.1 -0.15 -0.18 -0.21 -0.23 Residual standard deviation for HKU1: 0.0665 Residual ACF: 1 0.89 0.66 0.39 0.12 -0.08 -0.21 -0.26 -0.27 -0.24 -0.21 -0.17

There is substantial residual autocorrelation, so the Newey-West standard errors should be used.

Looking at the immunity coefficients individually, only `HKU1_samelt` has magnitude more than twice its standard error (`OC43_same` is close). This is somewhat misleading, however, since the short and long term immunity terms can trade-off against each other — a lower level of short-term immunity can be compensated to some extent by a higher level of long-term immunity, and vice versa. I computed the standard error for the sum of the short and long term coefficients, finding that for both viruses the magnitude of this sum is greater than twice its standard error, providing clear evidence that these viruses induce immunity against themselves.

The cross-immunity coefficients are much less than twice their standard error, so this model provides little or no evidence of cross immunity for these viruses.

As a robustness check, here is the same model fitted to the proxyAXss-filter data, which is less smoothed, with the minimal “fixups” needed to get reasonable results:

The curve for the seasonal effect is very similar to the fit using proxyEn-daily. The long-term immunity and cross-immunity effects look considerably different. Despite this, the red curve, giving the fitted values for log(*R*), is very similar. The changes in the immunity curves are mostly compensated for by changes in the offset for each virus (dark green dots) and in the long-term trend line (green) — the variation due to immunity and cross-immunity is similar.

As another robustness check, here are the results of applying the same model to the two alphacoronaviruses (click for larger version):

The seasonal effect for these viruses is similar to that for the betacoronaviruses. This is not a completely independent result, however, since some data is shared between the two virus groups in the construction of the proxies.

For further insight into the meaning of these estimates and standard errors, we can look at the incidence of these viruses, according to proxyEn:

For OC43, substantial peaks of incidence occur every year, but large peaks for HKU1 occur in only two of the five years. Because of this, the long-term exponentially-weighted incidence sum is more variable for HKU1 than for OC43, which means that more information is available about the immunity effects for HKU1 than for OC43.

It’s also important to keep in mind that there are only five years of data. Just from looking at the incidence plots above, one might think that the highest peak for HKU1, in 2017-2018, is a consequence of the almost non-existent peak in 2016-2017, which would result in fewer immune people the following year. However, the large 2017-2018 peak might also be caused in part by a generally increasing upward trend, which might also be discerned in the incidence for OC43. The possible trade-offs between these effects increase the uncertainty in inferring immunity and cross-immunity.

Beyond the formal indication of uncertainty captured by the standard errors, there is unquantified uncertainty stemming from the methods used to construct the proxies. These uncertainties could be captured formally only by a different approach in which the process by which the proxy data is produced is modelled in terms of latent processes, including the transmission dynamics of the viruses. There is also uncertainly stemming from my informal model selection procedure, which might be “overfitting” the data, or incorporating my preconceptions.

Taking these uncertainties into account, my conclusions from this modelling exercise are as follows:

- Both betacoronaviruses induce immunity against themselves. This immunity probably persists to a significant extent for at least one year. There is likely some apparent short-term immunity that is really a reflection of the fact that the data being modelled is aggregated over the entire United States.
- There may be some cross-immunity between the two betacoronaviruses, but this is not clearly established from this data.
- There is good evidence that the season affects transmissibility of these viruses. The 95% confidence interval for the ratio of the minimum to the maximum seasonal effect is (0.52,0.80). (This is found as exp(-0.44075±2*0.10621) using the estimate and standard error above.) One should note that this is an average seasonal effect over the entire United States; it is presumably greater in some regions and less in others.

With this post, I’ve completed my discussion of the first part of the paper. Somewhat strangely, the results from this part, whose validity I question above, are not *directly* used in the rest of the paper — they instead just provide a sanity check for the results of fitting a different model to the same proxy data on incidence, and as one input to the possible scenarios for the future that are considered. The second model that Kissler et al. fit is of the SEIRS (Susceptible-Exposed-Infected-Recovered-Susceptible) type. After being used to model the common cold betacoronaviruses, this SEIRS model is extended to include SARS-CoV-2, with various hypothesized characteristics. Simulations with these extended models are the basis for Kissler et al.’s policy discussion.

I’ll critique this SEIRS model for the common cold betacoronaviruses in a future post. But my next post will describe how the full-year regression models for *R _{t}* that are described above can easily be converted to dynamical models, that can be used to simulate from a distribution of hypothetical alternative histories for the 2015 to 2019 period. The degree to which these simulations resemble reality is another indication of model quality.

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period, Science, vol. 368, pp. 860-868, 22 May 2020 (released online 14 April 2020). The paper is also available here, with supplemental materials here.

In this post, I’ll look at how the authors estimate the reproduction numbers (*R*) over time for the four common cold coronavirus, using the proxies for incidence that I discussed in Part 2. These estimates for *R* are used to model immunity and cross-immunity for these viruses, and the seasonal effects on their transmission. These modelling results inform the later parts of the paper, in which they consider various scenarios for future transmission of SARS-CoV-2 (the coronavirus responsible for COVID-19), whose characteristics may perhaps resemble those of these other coronaviruses.

I will be using the code (written in R) available here, with GPLv2 licence, which I wrote to replicate the results in the paper, and which allows me to more easily produce plots to help understand issues with the methods, and to try out alternative methods that may work better, than the code provided by the authors (which I discussed in Part 1).

The reproduction number, *R*, for a virus is the average number of people that a person infected with the virus will go on to infect. A more detailed description of transmission is provided by the generation interval distribution, the probability distribution for the time interval, *a*, from a person being infected to them infecting another person, denoted by *g(a)*. The expected number of people infected at time *t _{2}* by a particular person infected at time

Here, I have assumed that time is an integer, measuring days, so *g(a)* is a discrete distribution over positive integers. It’s common in the epidemiology literature to treat time as continuous, but this would seem to make little sense unless one is modelling the likely quite complex details of daily cycles in viral physiology and human behaviour, which would lead to complex variation in *R* and *g* within each day. As far as I can tell, nobody does this.

It’s expected that R will change over time, both because the fraction of the population that is immune to infection may change, and because environmental influences on transmission of the virus may change, perhaps as the season changes, or as people alter their behaviour to avoid infection. It may also be that *g* changes with time, but handling that seems to be sufficiently difficult that the possibility is typically ignored.

When *R* is changing, there are at least two natural ways of defining its value at a given time. One is the average number of people that a person infected at time *u* will go on to infect, which I (and Kissler, et al.) denote by *R _{u}*. Looking backwards gives another definition, which I will denote by

Kissler, et al. reference a paper by Wallinga and Lipsitch which gives formulas for *R _{t}* and

The Kissler, et al. paper estimates *R _{u}* and then tries to model these estimates. Although

I will look at estimates for both *R _{t}* and

The formulas for *R _{t}* and

As discussed below, a suitable generative interval distribution for these coronaviruses puts significant probability on intervals of only a few days, so switching to a discrete-time representation in terms of weeks would substantially distort the results. Instead, Kissler, et al. interpolate the weekly incidence proxies to daily values. This is done by a procedure described in the supplementary material of a paper by te Beest, et al., as follows:

- The weekly incidence proxy for a virus is used to produce a weekly proxy for cumulative incidence.
- An interpolating spline is fit to the weekly cumulative incidence proxy (passing exactly through the data points).
- The values of this spline are evaluated at daily intervals, producing a daily proxy for cumulative incidence.
- Differences of successive values of this daily cumulative incidence proxy are taken, producing a proxy for daily incidence.

And advantage of this procedure emphasized by te Beest, et al. is that the total incidence in the daily series exactly matches the total incidence in the original weekly series. When the weekly incidence series is a noisy and likely biased measure of true incidence, this does not seem to be a compelling justification.

A disadvantage of this procedure is that it can produce negative values for daily incidence. When applied to the proxies used by Kissler, et al. (what I call proxyW in Part 2), there are 6 negative values for NL63, 30 for 229E, none for OC43, and 89 for HKU1. Kissler, et al. replace these negative values with 10^{-7}.

I’ve looked at how the other proxies I discuss in Part 2 fare in this regard. Fixing the holiday anomalies only slight improves matters — proxyWX, for example, which has a fudge for holidays, has only one fewer negative daily values than proxyW. Adjusting for outliers and zeros in the fraction of positive tests for each coronavirus has a bigger effect — proxyWXo, for example, has 2 negative values for NL63, 9 for 229E, none for OC43, and 38 for HKU1.

Negative daily incidence values are completely eliminated when the fractions of positive coronavirus tests that are for each virus are smoothed, as described in Part 2 (proxyWs and proxyAXn, for example, have no negative values).

Negative values can also be reduced by applying a filter to smooth the daily values after step 4 above. I use a filter with impulse response (0.04,0.07,0.10,0.17,0.24,0.17,0.10,0.07,0.04). After this filtering, proxyWXo has only 14 negative values (all for HKU1).

Note that it may be desirable to smooth at either or both of these stages beyond the amount needed to just eliminate negative daily incidence values, since values only slightly greater than zero may also be undesirable. I’ll show below what the effects of this smoothing are on estimates for *R*.

The formulas from above — *R _{t} = b(t) / ∑_{a} b(t-a)g(a)* and

The serial interval distribution is the distribution of the time from the onset of symptoms of one infected person to the onset of symptoms for another person whom they infect. If the incubation period (time from infection to developing symptoms) were the same for all people, the serial interval distribution would be the same as the generative interval distribution — the delay from infection to symptoms for the first person would be cancelled by the same delay from infection to symptoms for the person they infect. Since the incubation period is actually at least somewhat variable, the serial interval distribution will typically have a higher variance than the generative interval distribution, although their means will be the same. (See this paper for a more detailed discussion.)

In the Kissler, et al paper, using a serial interval distribution as they do is actually more appropriate than using a generative interval distribution, since the incidence proxies are not proxies for numbers of people infected each week, but rather for the number who come to the attention of the health care system each week. If people seek medical care a fixed time after developing symptoms, the serial interval distribution would be appropriate to use. (Of course, there is actually some variation in how long people wait before seeking medical care.)

The serial interval distribution used by Kissler, et al. is derived from a Weibull distribution with shape parameter 2.35 and scale parameter 9.48 (which has mean 8.4). This is a bit funny, since Weibull distributions are continuous, but a discrete distribution is needed here. Kissler, et al. just evaluate the Weibull probability density at integer points, up to the integer where the right tail probability is less than 0.01. This procedure will not generally produce a discrete probability distribution in which the probabilities sum to one. In this case, the probabilities sum to 0.995, so the error is small (it will lead to *R* being overestimated by about 0.5%).

The estimated SARS serial interval distribution used is from this paper by Lipsitch, et al. Here is the Weibull fit together with the count data it is based on:

The Weibull model seems fairly reasonable, except that it perhaps gives intervals of one or two days too high a probability. There is, however, no real reason to use a distribution having a simple parametric form like the Weibull — a better match with reality would probably result from just estimating these 20 probabilities, while incorporating some sort of smoothness preference, as well as whatever other biological knowledge might be available to supplement the data.

Using the daily incidence values interpolated from weekly numbers, *b(t)*, and the chosen generative/serial interval distribution, *g(a)*, Kissler, et al. obtain daily estimates for *R _{t}* and

Here are the smoothed weekly values for *R _{u}* found using proxyW, the proxy used by Kissler, et al. (click for larger version):

One can see some variability that looks like it must be noise, especially for HKU1. Here are the values for *R _{u}* without smoothing:

The red points represent values that are beyond the scale of the plot.

There’s quite a bit more noise in the plots of *R _{u}* without smoothing, which may be why Kissler, et al. thought it would be beneficial to model the smoothed estimates of

I argued above that it would be better to model *R _{t}* than

There’s a huge amount of noise here. I’ll show in the next post that fitting a model using these *R _{t}* estimates produces ridiculous results. So if Kissler, et al. ever considered modelling

Recall from above that *R _{u}* is estimated from the equation

The real problem, however, is that proxyW, used by Kissler, et al., has multiple problems, as discussed in Part 2 of this series. Better *R _{t}* estimates can be found using proxyWXss, which fixes the problem with holidays, fixes outliers in the coronavirus data, replaces zeros in the coronavirus data with suitable non-zero values, and smooths the percentages of positive tests for each virus amongst all positive coronavirus tests. Here are these

These seem much more usable. Of course, the estimates for *R _{u}* and their smoothed versions also have less noise when proxyWXss is used rather than proxyW.

Using some of the other proxies discussed in Part 2 gives estimates with even less noise. Here are the *R _{t}* estimates found using proxyDn, with filtering applied to the daily proxies:

After writing Part 2, it occurred to me that by replacing the ILI proxy by its spline fit, one can create a proxy, which I call proxyEn, that is entirely based on spline fits. Such a proxy can directly produce daily values, without any need for the interpolation procedure above. Here are the *R _{t}* estimates produced using this proxy:

In my next post, I’ll discuss the regression model that Kissler, et al. fit to their estimates of *R*, especially for the two betacoronaviruses that are related to SARS-CoV-2. This model has terms for immunity, cross-immunity, and seasonal effects. I will discuss problems with this model, and with how the authors interpret the results from fitting it, and look at alternatives. In subsequent posts, I’ll discuss the later parts of the paper, which model the incidence proxies in a different way using a SEIRS model, which is then extended to include SARS-CoV-2. This SEIRS model is the basis for their results that are intended to provide some policy guidance.

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period, Science, vol. 368, pp. 860-868, 22 May 2020 (released online 14 April 2020). The paper is also available here, with supplemental materials here.

In this post, I’ll start to examine in detail the first part of the paper, where the authors look at past incidence of “common cold” coronaviruses, estimate the viruses’ reproduction numbers (R) over time, and use those estimates to model immunity and cross-immunity for these viruses, and seasonal effects on their transmission. The results of this part inform the later parts of the paper, in which they model the two common cold betacoronaviruses together with SARS-CoV-2 (the virus for COVID-19), and look at various scenarios for the future, varying the duration of immunity for SARS-CoV-2, the degree of cross-immunity of SARS-CoV-2 and common cold betacoronaviruses, and the effect of season on SARS-CoV-2 transmission.

In my previous post, I used the partial code released by the authors to try to reproduce the results in the first part of the paper. I was eventually able to do this. For this and future posts, however, I will use my own code, with which I can also replicate the paper’s results. This code allows me to more easily produce plots to help understand issues with the methods, and to try out alternative methods. The code (written in R) is available here, with GPLv2 licence. The data used is also included in this repository.

In this second post of the series, I examine how Kissler et al. produce proxies for the incidence of infection in the United States by the four common cold coronaviruses. I’ll look at some problems with their method, and propose small changes to try to fix them. I’ll also try out some more elaborate alternatives that may work better.

The coronavirus proxies are the empirical basis for the remainder of paper.

The paper constructs these coronavirus proxies in two stages. First, a proxy is constructed for the incidence of “Influenza-Like Illness” (ILI), which may be actual influenza or some other infection causing similar symptoms. Second, this proxy is multiplied by the percent of tests on patients with ILI that were positive for each of the common cold coronaviruses, to produce proxies for incidence of each of these coronaviruses.

The data on Influenza-Like Illness (ILI) in the U.S. comes from the CDC FluView site. A proxy for ILI can be obtained by combining data on the total number of patient visits to participating providers with the number of these visits that are for ILI. The FluView site has the following warning:

The number and percent of patients presenting with ILI each week will vary by region and season due to many factors, including having different provider type mixes (children present with higher rates of ILI than adults, and therefore regions with a higher percentage of pediatric practices will have higher numbers of cases). Therefore it is not appropriate to compare the magnitude of the percent of visits due to ILI between regions and seasons.

We can’t let this warning stop us from doing anything, but it may be worth keeping in mind. (Kissler, et al. don’t mention this warning.)

The FluView data is by week, and contains the number of providers that week, the total number of patient visits, the total visits for ILI, and an age breakdown for ILI visits. It also contains the percentage of visits that are for ILI, which is simply the number of ILI visits divided by the total number of visits, and a “weighted percentage” of visits that are for ILI, which “is calculated by combining state-specific data weighted by state population” (see here).

The Kissler et al. paper simply uses this weighted percentage of ILI visits as their proxy for ILI. They refer to a paper by Goldstein et al. for discussion of the conditions under which this proxy will be proportional to the actual incidence of ILI, and when multiplied by the percentage of tests on ILI patients that are positive for a particular virus, will give a valid proxy for incidence of that virus:

This proxy would be a perfect measure (up to a multiplicative constant) of the incidence of infection with a particular strain if the following conditions were met: (i) the fraction of ILIs that result in medical consultations, while possibly varying by strain, does not vary by year or by week; (ii) the numbers of physician visits for all causes in the sentinel practices were consistent from week to week during the season; (iii) the sentinel practices were representative of the full population; and (iv) viral testing, which is conducted in a separate system, was performed on isolates representative of those obtained from patients consulting the sentinel practices for ILI.

One cannot, of course, expect the conditions above to be exactly satisfied. Goldstein et al. remark that “Nonetheless, we take this proxy measure as the best relative measure of influenza strain-specific incidence that can be calculated from surveillance data”. (The Goldstein et al. paper deals with influenza strains rather than coronaviruses, but the technique used is the same.)

In the Kissler et al. paper, condition (iii) it is not actually necessary — it would be sufficient for the proxies to be representative of some stable and reasonably diverse portion of the U.S. population, since that would be equally informative regarding the characteristics of the virus. So it is not clear that the weighted percentage used by Kissler et al., which I’ll call proxyW, is superior to the unweighted percentage, which I’ll call proxyU. It could be worse, if the weighting procedure produces a greater violation of condition (iv).

Condition (ii) is an unreasonable assumption even in an idealized world without all the other complications that exist in reality (such as holidays, as discussed below). For it to be true in conjunction with condition (i), it would be necessary for visits to physicians for *other* reasons to go down whenever visits for ILI go up. This could only be true if physician visits are rationed to a fixed number per week, and visits for ILI are prioritized over visits for other reasons. This seems quite unlike how the U.S. health care system operates. And in the data, the total number of physician visits per week in fact varies substantially.

A less unreasonable assumption would be that the number of non-ILI physician visits per week is constant. Then the appropriate ILI proxy — which I’ll call proxyA — would be the ratio of ILI visits to non-ILI visits. Of course, the number of non-ILI visits is not really constant, but this proxy could still be good if the variation in number of non-ILI visits is due to reporting issues, such as providers commencing or ceasing data reports to the CDC, and these issues affect ILI visits in proportion. Also, if the number of non-ILI visits varies for other reasons, perhaps these reasons are unrelated to ILI visits, and so just add unbiased noise to the measurements.

In the actual data, the percentage of visits due to ILI is never greater than about 8%, and consequently the difference between proxyU and proxyA is not large. But use of proxyU (or proxyW) will tend to diminish the size of peaks of incidence — for instance, an increase in proxyU from 1% to 8%, a factor of 8, corresponds to an increase in proxyA from 1/99=1.01% to 8/92=8.70%, a factor of 8.6.

ProxyW, proxyU, and proxyA all have other problems, one of which can be seen for proxyW by plotting its value over the time period used by Kissler et al. (click for larger version):

The red vertical lines indicate the start of CDC’s “flu season” (week 40); the green lines indicate its end (week 20 of the next year). The grey vertical lines are just before the weeks of the 4th of July, Labor Day, Thanksgiving, Christmas, and New Year’s Day holidays (note that CDC weeks start on Sunday). One can see that the weeks of Thanksgiving, Christmas, and New Year’s Day (the last three holidays of a year) have systematically higher values for proxyW (the same is true for proxyU and proxyA). This is most clearly seen in the plot at the end of 2015, but is true for all years (with one or two unclear cases).

Since the values immediately before and after these holidays are lower (noting that Christmas and New Year’s Day are always in consecutive weeks), it is clear that these jumps are not due to actual high levels of ILI during these holiday weeks. This may seem counter-intuitive — do people really decide to spend their holidays visiting their doctor about an illness that they would not have bothered getting attention for if they weren’t on holiday? They don’t. A log plot of ILI visits shows that there is actually a tendency towards fewer ILI visits in holiday weeks, especially for Thanksgiving (click for larger version):

The reason for the higher proxy values in holiday weeks is that those weeks have an even greater decrease in non-ILI visits:

The ranges for the vertical scale in these logarithmic plots are the same, so one can see that the holiday decreases in non-ILI visits are larger than the decreases in ILI visits — perhaps people don’t visit doctors for routine reasons when on holiday, but are more inclined to do so when actually sick. This is the reason for the spurious increase in the percentage of visits due to ILI on holidays.

I’ve tried correcting the holiday problem with a simple fudge — replace each holiday value with the linear interpolation from the two adjacent non-holiday values. I call the modified proxies proxyWX, proxyUX, and proxyAX. Here is how this fudge changes the values for proxyA (click for larger version):

The pink dots above are the logarithms of the proxyA values; the black lines from them go to the corresponding values for proxyAX. This fudge seems to have improved the situation with Thanksgiving, Christmas, and New Year holidays, though one might wonder whether the peak at Christmas 2014 has been flattened too much.

There could be other problems, however. A more general framework could view the number of reported cases of ILI in some week as being the result of three factors:

- The actual incidence of ILI that week.
- The tendency in that week for people with ILI to visit a physician.
- The availability to people with ILI that week of physicians who report ILI to the CDC.

To infer (1) from the number of reported cases of ILI, we need first to somehow infer (2) and (3), or at least their combined effect.

ProxyA (the ratio of ILI visits to non-ILI visits) is based on two assumptions. First, that (2) is the same for all weeks. Second, that (3) can be estimated by the number of non-ILI visits — that is, that the number of non-ILI visits can be taken as indicative of the availability to potential ILI patients of physicians who report to the CDC, or to put it another way, that the number of non-ILI visits is proportional to the probability that a patient who would like to visit a doctor due to ILI will visit one of the physicians who reports visits to the CDC.

We see that on holidays this second assumption is not correct, since the number of non-ILI visits drops substantially, but providers who report to the CDC seem to be as available (or nearly as available) to patients with ILI as on other days. It’s possible that on holidays there is a small decrease in the availability for ILI patients of providers who report to the CDC. This cannot easily be distinguished from a small decrease those weeks in the tendency of people with ILI to visit a doctor. But distinguishing seems unnecessary as long as the total effect can be estimated, which is what proxyAX tries to do, in a crude way.

A less crude method, which might also be able to correct problems other than holidays, is to build a model for ILI visits whose components can be identified (under some assumptions) as indicative of actual ILI or not. For this to work well, a model for non-ILI visits may also be needed, some of whose components can be identified (under more assumptions) as being indicative of the degree of availability of providers reporting to the CDC.

I’ll start with a model for non-ILI visits. A major component of this model will be the number of providers who report to the CDC each week, the log of which is plotted below:

Most notable is the sudden jump at the start of CDC’s official “flu season” (red line, week 40 of the year). Presumably, more providers are recruited at this time. Puzzlingly, although there is a corresponding drop at the end of the flu season (green line, week 20), it is much smaller. Providers recruited at the start of the flu season seem to drop out gradually. Perhaps many just get tired of the whole thing after a while (but if that’s the reason, it’s odd that apparently they all start doing it again next flu season). A more troubling possibility is that providers drop out because they have seen little or no ILI, which would bias the results, in the direction of ILI seeming to decline more slowly than it actually does.

In 2014 and 2017, there’s also a small jump after Labor Day. I can’t think of a reason for this, unless some providers are just eager to start flu season reporting.

These jumps in the log of the number of providers correspond to jumps in the log of the number of ILI and non-ILI visits, but the jumps in visits are smaller.

As a first attempt at modelling the log of the number of non-ILI visits, I fit a regression model with the log of the number of providers as a covariate, along with a spline with nine degrees of freedom to capture the trend over the five-year time period. Here is the fit (in blue) along with the data points:

Here are the residuals for this model:

The effect of holidays (4th of July, Thanksgiving, Christmas, and New Year’s Day) are clearly visible in the residual plot. The next model adds indicators for these holidays as covariates. They are 1 the week of the holiday and 0 otherwise (except that the Christmas indicator is more complicated, being split between early years and later years, and being partially active for the week before when Christmas is early in the week):

Here are the residuals for this model:

One can see a strong seasonal effect in which the number of visits grows in September, just before the official start of flu season (marked by the red line). Perhaps this relates to the end of summer vacations, or the start of school. There’s then an abrupt drop immediately after the start of flu season, presumably because the mix of providers suddenly changes. Other seasonal patterns also seem to be present.

To model a seasonal effect, I added covariates of the form sin(2πfy) and cos(2πfy), where y is a continuous indicator of years since the start of the period, and f is 1 or 2. These covariates are the initial terms of a Fourier series representation of a periodic function. I also added the week within the flu season and its square as covariates, allowing for the abrupt change when flu season starts.

Here is the fit of this model:

Here are its residuals:

This seems to be about as good a model as we can expect to get for non-ILI visits.

One way we might try to use this model is to use only its intercept, spline, and log(providers) components to predict non-ILI visits, and take the ratio of ILI visits to these predictions for non-ILI visits as a proxy for ILI incidence. I call this proxyB. It will have less noise than proxyA because it omits residuals in the predictions for non-ILI visits. It handles the holiday problem fairly well by omitting the holiday component in the non-ILI predictions. By omitting the seasonal component in the predictions, it avoids mis-characterizing seasonal variation in non-ILI visits as seasonal variation in incidence of ILI.

However, proxyB has some serious problems. Here is how it compares to proxyAX (click for larger version):

ProxyB systematically moves points (compared to proxyAX) in a way that introduces a discontinuity where flu season starts (the red line). It also seems to be doing something wrong in January of 2016.

We might hope to do better using a model for the log of the number of ILI visits. As covariates, we can again use a spline, the log of the number of providers, and the indicators of holidays. However, I’ll use a spline with more knots (one at the start of each month), which can model seasonal effects as well as long-term trends. (Unlike for non-ILI visits, the seasonality for ILI visits is not firmly fixed to the calendar — the time of the peak varies in different years by several weeks — so a Fourier representation of seasonality that is common to all years is not appropriate.)

Here is the fit for this model of log ILI visits:

Here are its residuals:

This residual plot doesn’t reveal any serious problems with the model (a little heteroskedasticity, but we can tolerate that). To use it to produce an ILI proxy, we need to decide what components of the model to retain. The variation due to holidays seems like it can’t be indicative of real change in ILI. The variation due to change in the number of providers also seems irrelevant. So we’re left with the spline fit and the residuals:

I call this proxyC (after applying an ad hoc overall scaling factor to get it to roughly match the scale of other proxies). Here’s how it compares to proxyAX (click for larger version):

If one looks at the time at and just after Christmas / New Year’s Day, ProxyC seems to do better than proxyAX in 2016/2017, 2017/2018, and 2018/2019, but worse in 2014/2015 and 2015/2016, assuming that curves that are smoother here are more likely to be correct.

One can also see a general tendency for proxyC to lower the minimum incidence in summers before 2018, and to raise the peak incidence in winter in 2018 and 2019.

Finally, here’s the fit of another model for log of the number of ILI visits, in which the residual from the final model above for the log of the number of non-ILI visits is included as a covariate (in addition to those used in the ILI visits model above):

This doesn’t look much different from the fit for the first model of ILI visits, but there is a significant improvement, with the estimated standard deviation of the residuals declining from 0.05106 to 0.04703. The regression coefficient for the non-ILI model residuals in the ILI model is 0.809, so most the the residual variation in non-ILI visits seems to be indicative of accessibility of providers to potential ILI patients, rather than variation in demand for non-ILI medical services.

I use the spline fit and residuals from this model to create another ILI incidence proxy, which I call proxyD (after again applying an overall scale factor to make it roughly comparable to the other proxies). Here’s how proxyD compares to proxyC (click for larger version):

The changes in proxyD are small, but seem to be desirable, assuming that lesser discontinuity is more likely to be correct.

ProxyD seems to be the best of these proxies, with proxyAX being a much simpler alternative that’s worth comparing against. To conclude this section, here is how proxyW, used by Kissler et al., compares to proxyD:

One can see the holiday problem with proxyW. There’s also a general tendency for proxyW to be lower than proxyD in 2018 and 2019. This seems related to the rise in the number of providers reporting to the CDC those years, which went along with a rise in the number of non-ILI visits *per provider* in those years, and a decrease in the number of ILI visits per provider (at least in some periods). The warning from the FluView site quoted above may be relevant here — the additional providers reporting to the CDC may serve a population that makes relatively more non-ILI visits.

The Kissler et al. paper uses their ILI proxy to create proxies for the four common cold coronaviruses — NL63, 229E, OC43, and HKU1. [ Note: The R code uses the name E229 instead of 229E, since the latter is not a valid R identifier. ]

This is done very simply in the paper — their proxy for incidence of a virus is just their proxy for ILI (proxyW, weighted percent ILI among all visits) times the percent of tests on ILI samples that were positive for that virus.

Here’s what their proxies look like (click for larger version):

These plots don’t look too bad. But look at these proxies on a log scale:

OC43 still looks OK, but the others, especially HKU1, have a lot of noise in the smaller values. Some of the values are zero, which are shown above as dots at the bottom. Note that it is the logarithmic plot that is relevant to assessing how variability will affect the later estimation of R values.

After some investigation, I decided that one thing that needs to be done to get good results is to adjust a few outliers. There is no information available about what is responsible for these extreme values. They could be testing or reporting anomalies, or they could be correct but unrepresentative (eg, due to positive tests for all people in one large family).

Here are plots of the log of the percent positive tests for each virus, with points I regard as outliers in red, and with lines to where I moved them, with zeros again plotted at the bottom (click for larger version):

The outliers were replaced by the average of the four nearest points, except that for the NL63 outlier, which is zero, this average was divided by 1.3, to reflect the possibility that the zero value is not entirely spurious.

I also replaced all the zero values with the minimum non-zero value for the series, divided by 1.3 to reflect that while a zero measurement cannot represent an actual zero incidence, it is evidence that the incidence then was lower than the non-zero measurements.

Proxies based on the percentages after these adjustments for outliers and zeros have an “o” appended to their name. Here, for example, is log proxyAXo for the four viruses (click for larger version):

These proxies do not have extreme points, but are still rather noisy. As I’ll discuss in a later post, Kissler, et al. handle noise by averaging their final estimates for R over three weeks. It seems better to instead reduce noise at this stage.

The approach I’ve taken is to re-express the proxy for a virus as the product of the ILI proxy, the *total* of positive test percentages for all four viruses, and the ratio of percent positive tests for that virus to the total. If no smoothing is done, this is the same as the product of the ILI proxy and the percent positive tests for that virus. But the effect of smoothing the total and/or the ratios to the total is different from smoothing the percentages of positive tests. (Outliers and zeros are adjusted before these procedures.)

I create two new sets of proxies by smoothing the ratios of percent positive for each virus to the total percent positive, done by applying a filter with impulse response (0.2,0.6,0.2) either once or twice. I name these proxies by appending “s” or “ss” to the name of the ILI proxy. Here is log proxyAXss:

I’ve also created proxies based on smoothing with splines. Here are the spline fits for the logits of ratios of percent positive to total percent positive (click for larger version):

Here is the spline fit for the log of the total percent positive:

The rationale for this approach, smoothing the total and the ratios separately, is that the total captures a good chunk of the variability but shows a more regular pattern, with less noise, than the individual percent positive series. So modelling the total separately could be beneficial. I haven’t done any extensive tests to validate this intuition, however.

I used these spline fits to create two new versions of the coronavirus proxies. The “m” versions use the spline fit for the ratios but the actual values for the totals. The “n” versions use the spline fits for both. Here is proxyDn:

Noise has certainly been reduced. One might of course wonder whether some real features have been smoothed out of existence.

In my next post, I’ll discuss how the coronavirus incidence proxies are used in the paper to produce estimates for the reproduction numbers (R) of these viruses, for each week from mid-2014 to mid-2019. This will be followed by a post discussing how the paper models these R values, in terms of immunity and seasonal effects. As above, I will discuss problems with the methods in the paper, and alternatives that might be better.

After those posts, I’ll go on to discussing the later parts of the paper, which are intended to provide some policy guidance.

]]>I’ve been looking at the following paper, by researchers at Harvard’s school of public health, which was recently published in *Science*:

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch (2020) Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period (also available here, with supplemental materials here).

This is one of the papers referenced in my recent post on seasonality of COVID-19. The paper does several things that seem interesting:

- It looks at past incidence of “common cold” coronaviruses, estimating the viruses’ reproduction numbers (R) over time, and from that their degrees of cross-immunity and the seasonal effect on their transmission.
- It fits an ODE model for the two common cold betacoronaviruses, which are related to SARS-CoV-2 (the virus for COVID-19), using the same data.
- It then adds SARS-CoV-2 to this ODE model, and looks at various scenarios for the future, varying the duration of immunity for SARS-CoV-2, the degree of cross-immunity of SARS-CoV-2 and common cold betacoronaviruses, and the effect of season on SARS-CoV-2 transmission.

In future posts, I’ll discuss the substance of these contributions. In this post, I’ll talk about my efforts at reproducing the results in the paper from the code and data available, which is a prerequisite for examining why the results are as they are, and for looking at how the methods used might be improved.

I’ll also talk about an amusing / horrifying aspect of the R code used, which I encountered along the way, about CDC data sharing policy, and about the authors’ choices regarding some graphical presentations.

The authors released some of the code and data used in the paper in three github repositories:

- https://github.com/c2-d2/CoV-seasonality
- https://github.com/skissler/nCoV_introduction
- https://github.com/c2-d2/postpandemic (however, this may pertain only to an earlier version of the paper)

These repositories correspond roughly (but incompletely) to the three parts of the paper listed above. I’ll talk about reproducing the results of each part in turn.

The paper uses data from the CDC to estimate how the reproduction number, R, for the four “common cold” coronaviruses has changed over time (from Fall 2014 to Spring 2019, in the US), and uses these estimates for R to estimate the impact of seasonality on transmission for these coronaviruses, the degree of immunity that develops to them, and the degree of cross-immunity between the two betacoronaviruses (HKU1 and OC43). Since SARS-CoV-2 is also a betacoronavirus, one might expect it to behave at least somewhat similarly to the two common cold betacoronaviruses, and for there to perhaps be some cross-immunity between SARS-CoV-2 and the other betacoronaviruses.

The procedure used in the paper to estimate R over time for each of these viruses has several steps:

- The incidence of infection each week with the common cold coronaviruses was estimated (up to an unknown scaling factor, relating to how likely sick people are to visit a doctor) by multiplying the weekly reports of physician visits for Influenza-Like Illness (ILI) by the weekly percentage of laboratory tests for the four coronaviruses that were positive for each of them.
- A spline-based procedure was used to interpolate daily incidence of each virus from these weekly numbers .
- From these daily incidence numbers, estimates of R for each day were obtained using a formula that looks at the incidence that day and the previous 19 days.
- The daily estimates for R were used to produce weekly estimates for R, by taking the geometric mean of the 21 daily values for the week in question and the previous and following weeks.

The first problem with reproducing these estimates is that although the data on physician visits for ILI is available from here (and included in the first repository above), the CDC allows access to only the *last two years* of data on positive tests for common cold coronaviruses (from here). According to the README for the repository, “Full data used in paper is available through a data use agreement with the CDC”.

This sort of bullshit makes one wonder about the mentality of the people running the CDC. There is obviously *no reason whatever* for keeping this data under wraps. Patient confidentiality can’t be an issue, both due to the nature of the data, and to the fact that they do make it public for the last two years. Nor can it be a matter of minimizing work on the part of the CDC — it must take extra effort to keep removing older data so that only two years are available, not to mention the effort of processing data use agreements.

This CDC policy certainly resulted in extra work for the authors of this paper. They included the last two years of publicly-available data in the first repository above, along with R code that had been modified to work with only two years of data rather than five years. The results produced are of course not the same as in the paper.

Fortunately, the second repository above has a data file that in fact includes the full data that was omitted from the first repository. The data can be reformatted to the required form as follows:

dfi <- read.csv("../nCoV_introduction-master/nrevssCDC_ILI.csv",head=TRUE) dfo < as.data.frame( list(RepWeekDate=as.character(as.Date(dfi$WEEKEND),"%m/%d/%y"), CoVHKU1=round(100*dfi$HKU1,7), CoVNL63=round(100*dfi$NL63,7), CoVOC43=round(100*dfi$OC43,7), CoV229E=round(100*dfi$E229,7))) write.table (dfo, "full-Corona4PP_Nat.csv", sep=",", row.names=FALSE, col.names=TRUE, quote=FALSE)

Now the remaining task is to modify the supplied R code in the first repository so it works with the full five years of data. Here are the crucial diffs needed to do this:

-# Data below shared by NREVSS team -df.us_cov_national <- read.csv("Corona4PP_Nat.csv") #2018-03-10 through 2020-02-29 +# Reconstruction of full dataset used in paper. +# Data is for 2014-07-05 through 2019-06-29. +df.us_cov_national <- read.csv("full-Corona4PP_Nat.csv") - Week_start < "2018-07-01" ~ 0, # First season (and only complete season) in this dataset is 2018-19 - (Week_start >= "2018-07-01") & (Week_start < "2019-07-01") ~ 1, - (Week_start >= "2019-07-01") & (Week_start < "2020-07-01") ~ 2)) # 2018-2019 is the last season in our data + Week_start < "2014-07-06" ~ 0, # Before first season + Week_start < "2015-07-05" ~ 1, + Week_start < "2016-07-03" ~ 2, + Week_start < "2017-07-02" ~ 3, + Week_start < "2018-07-01" ~ 4, + Week_start < "2019-06-30" ~ 5, # 2018-2019 is the last season, last data is for 2019-06-29 + TRUE ~ 0)) # after last season -for(s in 1:2){ - temp.df <- df.us_all_national_withR %>% filter(season==s, epi_week>=season_start | epi_week<=season_end) +for(s in 1:5){ + temp.df <- df.us_all_national_withR %>% filter(season==s, epi_week>=season_start | epi_week<=(season_end-(s==1))) # -(s==1) to fudge for 53 weeks in 2014 - season==1 ~ "2018-19", - season==2 ~ "2019-20")) %>% - mutate(season=factor(season, levels=c("1", "2"))) #Set season 1 as reference group in regression -# Note: with this limited dataset, season 2 is incomplete. Full dataset has 5 complete seasons. + season==1 ~ "2014-15", + season==2 ~ "2015-16", + season==3 ~ "2016-17", + season==4 ~ "2017-18", + season==5 ~ "2018-19")) %>% + mutate(season=factor(season, levels=c("1", "2", "3", "4", "5"))) #Set season 1 as reference group in regression

I also added code to produce various plots and other output, some corresponding to plots in the paper or supplemental information, and some for my use in figuring out what the code does. The original code doesn’t come with an open-source license, so I won’t post my full modified source file, but some of the code that I added at the end is here, and some of the plots that it produced are here and here.

I will, however, talk about one little snippet of the original program, whose behaviour is… interesting:

RDaily <- numeric() for(u in 1:(length(week_list)*7)){ #Estimate for each day sumt <- 0 for(t in u:(u+stop)){ #Look ahead starting at day u through (u+max SI) suma <- 0 for(a in 0:(stop)){ #Calc denominator, from day t back through (t-max SI) suma = daily_inc[t-a,v]*func.SI_pull(a, serial_int) + suma } sumt = (daily_inc[t,v]*func.SI_pull(t-u, serial_int))/suma + sumt } RDaily[u] = sumt }

This code computes daily estimates for R (putting them in `RDaily`), using the following formula from the supplemental information:

Notice that the loop for `u` starts at 1, the loop for `t` inside that starts at `u`, and the loop for `a` inside that starts at 0, and goes up to `stop` (*i _{max}* in the formula), whose value is 19. For the first access to

Well, R programmers will know that negative indexes are allowed, and extract all items *except* those identified by the negative subscript. So `daily_inc[-1,v]` will create a long vector (1819 numbers) without error. It seems like an error should arise later, however, when this results in an attempt to store 1819 numbers into `RDaily[u]`, which has space for only one.

But crucially, before a negative index gets used, there’s an attempt to access `daily_inc[0,v]`. R programmers may also know that using a zero index is not an error in R, even though R vectors are indexed starting at 1 — zero indexes are just ignored. (I’ve previously written about why this is a bad idea.) When the subscript is a single zero index, ignoring it results in extraction of a zero-length vector.

Now, zero-length vectors also seem like the sort of thing that would lead to some sort of error later on. But R is happy (for good reason) to multiply a zero-length vector by a scalar, with the result being another zero-length vector. The same is true for addition, so when `t-a` is 0, the effect is that `suma` in the innermost loop is set to a zero-length vector. (This is not the same as 0, which is what it was initialized to!)

Only *after* `suma` has been set to a zero-length vector does it get multiplied by a vector of length 1891, from accessing `daily_inc[-1,v]`. R is also happy to multiply a zero-length vector by a vector of length greater than one (though this is rather dubious), with the result being a zero-length vector. So `suma` stays a zero-length vector for the rest of the inner loop, as `daily_inc` is accessed with indexes of -1, -2, …, -18. After this loop completes, `suma` is used to compute a term to add to `sumt`, with R’s treatment of arithmetic on zero-length vectors resulting in `sumt` being set to a zero-length vector, and remaining a zero-length vector even when `t` becomes large enough that accesses with indexes less than one are no longer done.

But it still seems we should get an error! After the loop over `t` that computes an estimate for R at time `u`, this estimate is stored with the assignment `RDaily[u]=sumt`. Since `sumt` is a zero-length vector, we’d expect an error — we get one with code like `x=c(10,20);x[2]=numeric()` for example (note that `numeric()`) creates a zero-length numeric vector). Now, the code is actually extending `RDaily`, rather than replacing an existing element, but that doesn’t explain the lack of an error, since code like `x=c(10,20);x[3]=numeric()` also gives an error.

The final crucial point is that all these “out-of-bounds” accesses occur at the *beginning* of the procedure, when `RDaily` is itself a zero-length vector. For no clear reason, R does *not* signal an error for code like `x=numeric();x[3]=numeric()`, but simply leaves `x` as a zero-length vector. And so it is in this code, with the result that `RDaily` is still a zero-length vector after all operations with zero and negative out-of-bounds accesses have been done. At that point, when `u` is 20, a sensible value for R will be computed, and stored in `RDaily[20]`. R will automatically extend `RDaily` from length zero to length 20, with the first 19 values set to `NA`, and later computations will proceed as expected.

So in the end, the result computed is sensible, with the estimates for R on days for which data on 19 previous days is not available being set to `NA`, albeit by a mechanism that I’m pretty sure was not envisioned by the programmer. Later on, there are also out-of-bounds accesses past the end of the vector, which also result in `NA` values rather than errors. All these out-of-bounds references can be avoided by changing the loop over `u` as follows:

for (u in (stop+1):(length(week_list)*7-stop)) { #Estimate for each day

The code produces estimates of R for each week of the cold season for all four coronaviruses, but attention focuses mainly on the two betacoronaviruses, HKU1 and OC43. A regression model is built for the R values of these viruses in terms of a seasonal effect (modelled as a spline, common to both viruses) and the effects of immunity from exposure to the same virus and of cross-immunity from exposure to the other of the two betacoronaviruses (four coefficients). The immunity effects can only be estimated up to some unknown scaling factor, with the assumption that the sum of weekly incidence numbers up to some point in the season is proportional to the fraction of the population who have been exposed to that virus.

The results I get match the regression coefficients in Table S1 of the paper’s supplemental information, and some additional plots are also as expected given results in the paper.

The seasonal and immunity effects over time are summarized in Figure 1 of the paper. Here is the part of that figure pertaining to HKU1 and the 2015-2016 cold season:

The orange curve shows the estimated multiplicative seasonal effect on R (horizonal dots are at one), the red curve is the estimated effect on R from immunity to HKU1, and the blue curve is the estimated effect from cross-immunity to OC43.

Here is my reproduction of this figure (without attempting to reproduce the error bands):

This seems to perfectly match the plot in the paper, except that the plot in the paper shows only 30 weeks, whereas the model is fit to data for 33 weeks, which is also time span of the spline used to model the seasonal effect. As one can see in my plot, after week 30 (at the bold vertical bar), the modelled seasonal effect on R rises substantially. But this feature of the model fit is not visible in the figures in the paper.

Researchers at Harvard really ought to know that they should not to do this. The rise after week 30 that is not shown in their plots is contrary to the expectation that R will decrease in summer, and is an indication that their modelling procedure may not be good. In particular, after seeing this rise at the end of the season, one might wonder whether the sharp rise in the seasonal effect on R seen at the beginning of the season is actually real, or is instead just an artifact of their spline model.

The second major topic of the paper is the fitting of an ODE (Ordinary Differential Equation) model for the incidence of the two common cold betacoronavirues. The data used is the same as for the first part of the paper, but rather than directly estimate R at each time point, an underlying model of the susceptible-exposed-infected-recovered-susceptible (SEIRS) type is used, from which incidence numbers can be derived, and compared to the data.

According the paper and supplemental information, the parameters of the SEIRS model (eg, the degree of seasonal variation, and the rate at which immunity wanes) were fit by a procedure combining latin hypercube sampling (implemented in R) and Nelder-Mead optimization (implemented in Mathematica). The code for these procedures has not been released, however. Hence reproducing this part of the paper is not possible.

The second repository above does contain code to run the SEIRS model, with parameters set to values that are fixed in the code (presumably to the values found by the optimization procedure that they ran).

This SEIRS model produces values for R for each virus and time point, which can be compared to the estimates from the first part of the paper. To do this, the R code for this part of the paper needs to read the estimates for R produced by the R code for the first part. These estimates can be written out as follows:

rmv <- c(1:3,nrow(Reff.CoV_ili_x_pos_pct_SARS):(nrow(Reff.CoV_ili_x_pos_pct_SARS)-2)) write.table (Reff.CoV_ili_x_pos_pct_SARS[-rmv,], "R_ili_x_pos_pct_SARS.csv", row.names=FALSE, col.names=TRUE, quote=FALSE, sep=",")

My modified version of the `figuremaker.R` R source file provided in the second repository above is here. It has small modifications to read the data as written out above, and to enable production of plots.

One of the plots produced by running this code is an exact reproduction of Figure 2A in the paper:

This plot shows the actual and simulated incidence of the two common cold betacoronaviruses over five cold seasons.

Running the code also produces what should be reproductions of Figures 2B and 2C, in which the values for R produced by the best-fit SEIRS model (the curve) are compared to the weekly estimates for R from the first part of the paper. But these reconstructions do not match the paper. Here is Figure 2B from the paper:

And here is what the code produces:

The curves are the same (apart from vertical scale), but the figure in the paper is missing the first 12 estimates for R, and the first few estimates that follow those are noticeably different.

I found that an exact reproduction of Figures 2B and 2C in the paper *can* be obtained by re-running the code for estimating R using a data file in which the first eleven ~~estimates for R~~ weeks of coronavirus data have been deleted, and in which the code has been changed to say that the first season starts on 2014-09-28 (rather than 2014-07-06). Here is the perfectly-matching result:

Perhaps Figures 2B and 2C in the paper were inadvertently produced using a file of R estimates created using preliminary code that for some reason treated the start of the season differently than the final version of the code. It’s unfortunate that the published figure is somewhat misleading regarding the match between the R estimates from the first part of the paper and the R estimates from the SEIRS model, since this match is significantly worse for the missing data points than for the others.

The final part of the paper extends the ODE model for the two common cold betacoronaviruses to include SARS-CoV-2, considering various possibilities for the characteristics of SARS-CoV-2, such as degree of seasonality and duration of immunity, as well as various interventions such as social distancing.

Although the general structure of this extended model is documented in the supplemental information, the only code relating to these simulations is in the third repository above, which appears to be for a preliminary version of the paper. This code is in the form of a Mathematica notebook (which can be viewed, though not executed, with the free program here). The figures in this notebook resemble those in Figure 3 of the paper, but do not match in detail.

A further-extended model is used to model scenarios regarding health care utilization, and described in the supplemental information. No code is available for this model.

This post has been largely confined to finding out whether the results in the paper can be reproduced, and if so how.

For the first part of the paper, in which estimates for R through time were made, and used to examine seasonal and immunity effects, I’ve been able to fully reproduce the results. For the second part, the SEIRS model for infection by common cold coronavirues, the results for specified parameter values can be reproduced, but the optimization method used to find best-fit parameters is not at at all reproducible from the information provided. The third part, in which future scenarios are simulated, is also not reproducible.

In future posts, I’ll discuss the substantive results in the paper, informed where possible by experiments that I can do now that I have code that reproduces some of the results in the paper. I’ll also consider possible improvements in the methods used.

]]>

There is reason to hope that it will, since in temperate climates influenza and the four coronaviruses that are among the causes of the “common cold” do follow a seasonal pattern, with many fewer cases in the summer. If COVID-19 is affected by season, this would obviously be of importance for policies regarding “lockdown” and provision of health care resources. Furthermore, understanding the reasons for seasonal variation might point towards ways of controlling the spread of COVID-19 (caused by a coronavirus sometimes referred to as SARS-CoV-2, though I’ll usually ignore this pedantic distinction).

I’ll look here at the evidence for seasonality in influenza and the common cold coronaviruses, and to what extent one might expect COVID-19 to also be seasonal. I’ll consider three classes of possible reasons for seasonality — seasonal changes in virus survival and transmissibility, in human resistance to infection, and in social behaviour. I’ll then consider whether we might be able to enhance such seasonal effects, further reducing the spread of COVID-19 in summer, and also extend these effects to winter.

To start, here’s some of my recently-acquired amateur knowledge of virology, from various sources (eg, Fenner and White’s Medical Virology, contents currently available online):

- Viruses have a protein “capsid” enclosing their genetic material (RNA or DNA).
- In “enveloped” viruses, the capsid is itself surrounded by a membrane derived from a membrane of the host.
- The viral envelope is generally needed for infectivity.
- Enveloped viruses are generally more vulnerable to environmental influences.

The virus causing COVID-19 as well as other coronaviruses are enveloped, as is influenza virus. Of the viruses responsible for the “common cold”, rhinoviruses and respiratory adenoviruses are non-enveloped, while coronaviruses, human metapneumoviruses, human parainfluenza viruses, and respiratory syncytial viruses are enveloped.

Infection with influenza or common cold viruses typically leads to at least some degree of immunity in the survivors, but this immunity may be short lived (eg, a year or so), due either to the person’s immune response fading, or to the virus mutating to a form less recognized by the immune system. Note that the “common cold” can be caused by various unrelated viruses, so it is not uncommon to get multiple colds in a year, even if one is immune to the viruses previously contracted. There are also multiple strains of influenza virus, with immunity to one not conferring complete immunity to the others.

The enveloped common cold viruses display seasonal patterns in temperate climates (usually, but not always, with a peak in winter/spring), while the non-enveloped common cold viruses typically do not show strong seasonality. For example, see Price, Graham, and Ramalingam, Association between viral seasonality and meteorological factors (also here). The review by Moriyama, *et al* is also of interest.

Seasonality can be seen in the data at CDC’s NREVSS site. For instance, here is a plot of the fraction of tests of US patients with respiratory illness who are positive for each of the four common cold coronaviruses (from here):

The plot goes from April 2018 to April 2020. The two biggest peaks are in early January of 2019 and 2020, with the first being for OC43 and the second for HKU1, which are betacoronaviruses. Peaks are also seen for NL63 and 229E, which are alphacoronaviruses. Note that COVID-19 is caused by a betacoronavirus.

Influenza (also an enveloped virus) displays seasonal patterns as well, with a winter/spring peak in temperate climates, and more complex patterns in tropical climates. For example, see Tamerius, *et al*. Global Influenza Seasonality: Reconciling Patterns across Temperate and Tropical Regions (also here), which has the following figure:

That enveloped respiratory viruses are typically seasonal and non-enveloped respiratory viruses are typically non-seasonal is consistent with a viral envelope being a vulnerability, with the degree of vulnerability varying with the season. The season might also affect host vulnerability, however — for example, via the effect of sunlight on vitamin D levels. Many social behaviours that could affect spread of viruses are also seasonal — for example, school schedules, sporting activities, holidays, and vacations. So the reasons for seasonality could be complex.

Despite the strong seasonality seen for many viruses, such as the influenza and common cold coronaviruses in the plots above, it is conceivable that the magnitude of the seasonal effect is actually quite small. This possibility is discussed by Dushoff, *et al*. in Dynamical resonance can account for seasonality of influenza epidemics. If immunity is short-lived, oscillations in the number of cases are expected as the number of susceptible individuals is reduced by a widespread infection, but then rises again as immunity fades or the virus mutates. If the period of such intrinsic oscillations is approximately a year, even a small seasonal effect on infectivity could have a synchronization effect that leads to the peaks of such oscillations always occurring in the same season, producing a seasonal effect that appears to be much larger than the actual seasonal forcing.

The reason(s) for seasonality of influenza and other viruses are not well established. The reviews by Tamerius, *et al* and Moriyama, *et al* consider many possibilities.

It could be that the ability of some viruses to survive outside the body, either in air or on surfaces, is affected by the season. For influenza, one popular theory is that low absolute humidity is conducive to virus survival and transmission (see Shaman and Kohn and Shaman, Goldstein, and Lipsitch, for example), though it seems possible that very high humidity is also conducive to virus transmission. Both indoor and outdoor absolute humidity is typically lower in winter than in summer. High temperature may also independently curtail virus transmission, and higher UV radiation in summer may reduce virus survival. It is possible that seasonal changes in humidity and temperature also affect host resistance to infection.

The higher UV radiation in summer will (amongst people who go outside without blocking it with clothing or lotions) also increase vitamin D production. Since vitamin D levels affect the immune system, this has been hypothesized as a reason for seasonality of influenza (eg, Cannell, *et al*, On the epidemiology of influenza, also here). Some subsequent randomized trials of vitamin D supplementation have supported this hypothesis (eg, Urashima, *et al*, looking at 334 Japanese schoolchildren), while others have not found such an effect (eg, Aloia, Islam, and Mikhail, looking at 184 older African-American women). A simulation study (Shaman, Jeon, Giovannucci, and Lipsitch) reported that the seasonality of influenza was more robustly reproduced based on absolute humidity than on vitamin D levels. One should note that sunlight has other possibly beneficial effects besides vitamin D production, such as increasing levels of nitric oxide, which reduces blood pressure.

Many social activities that could affect virus transmission vary seasonally, of which school attendance is perhaps the most impactful.

Seasonality seems to have been most studied for influenza, but as seen above, common cold coronaviruses are also seasonal. Both influenza and coronaviruses are enveloped, which might lead to similar seasonally-varying environmental sensitivities. Seasonal effects from host immunity and social behaviour might also be common to influenza and coronaviruses. So it’s natural to think that COVID-19 might also be seasonal.

There are really two questions regarding seasonality: First, if in future COVID-19 becomes an established human virus, will it produce seasonal epidemics, like influenza and the four common cold coronaviruses? And second, right now, during COVID-19’s initial expansion, will the pandemic slow down with the arrival of summer? As discussed above, the answer to the first question could be “yes”, but the answer to the second could be “not much” — even a small seasonal effect on infectivity could synchronize periodic outbreaks as immunity fades so they always occur in winter, while only slightly slowing the pandemic spread in a population with no immunity.

Marc Lipsitch (co-author of two papers cited above) answers the question — Seasonality of SARS-CoV-2: Will COVID-19 go away on its own in warmer weather? — with “probably not”, and more precisely “we may expect modest declines” in contagiousness, but “it is not reasonable to expect these declines alone to slow transmission enough to make a big dent”. The degree of certainty in this last statement seems unwarranted, considering that he also says that “Predicting how a novel virus will behave based on how others behave is always speculative”, unless he means only to say that a big dent from seasonal effects is far from being guaranteed.

One should note that the *possibility* that large seasonal peaks can result from small seasonal effects, due to synchronization of immunity-driven oscillations, does not necessarily imply that the seasonal effects for a particular viral disease *actually are* small — large seasonal effects could certainly also result in seasonality! And in fact, it seems that there may be some large-magnitude seasonal effects, such as that of absolute humidity on influenza as reported by Shaman and Kohn (bottom plots in Figure 1) and by Shaman, Goldstein, and Lipsitch (Figure 1).

The recent paper by Kissler, Tedijanto, Goldstein, Grad, and Lipsitch on Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period estimates the magnitude of the seasonal effect in the US for the two betacoronaviruses, HCoV-OC43 and HCoV-HKU1, and that are causes of the common cold, reporting: “According to the best-fit model parameters, the R_{0} for HCoV-OC43 and HCoV-HKU1 varies between 1.7 in the summer and 2.2 in the winter and peaks in the 2nd week of January”. This is a substantial seasonal effect. The seasonal effect for COVID-19 might be either larger or smaller. In their paper, Kissler, *et al* consider scenarios in which R_{0} for COVID-19 varies seasonally, and conclude that “SARS-CoV-2 can proliferate at any time of year”. However, this conclusion is based on simulations that assume R_{0} in summer is reduced by at most 40% compared to winter; the actual magnitude of the seasonal effect is of course unknown.

It is possible that COVID-19 is not currently spreading amongst a population with no immunity — there may be some immunity to COVID-19 arising from infection with the common-cold coronaviruses, especially the two betacoronaviruses. This is considered by Kissler, *et al*, on the assumption that any cross-immunity is the same for HCov-OC43 and HCoV-HKU1. As seen in the plot above, in the US cold season just ended, the peak for HCoV-HKU1 in January 2020 was greater than for HCoV-OC43, whereas the reverse was the case in January 2019. A difference in cross immunity with COVID-19 between these two strains would affect how much COVID-19 is currently being suppressed in the US by cross immunity, which in turn would affect expectations of its strength next fall, when immunity induced by these common cold coronaviruses would have had longer to fade. There is presumably geographic variation in which coronaviruses were most common recently.

If any of the possible seasonal influences discussed above actually apply to COVID-19, we might hope to deliberately enhance these influences, to further suppress the virus during the summer, and to extend these effects to winter as well. Some possible interventions of this sort seem to have a good chance of being quite effective, at substantially lower cost than the “lockdown” currently in effect in many countries, which is in any case not sustainable.

Higher absolute humidity is often seen as a factor for lower prevalence of influenza in summer, and may well inhibit COVID-19 as well. Indoor humidity is relatively easily controlled — in some buildings, by simply turning the control knob on an existing humidifier. Humidifiers are comparatively simple devices whose production could be scaled up quickly, at low cost compared to many current interventions.

There has been some, but really surprisingly little, work on humidification as a method for reducing respiratory infections. A 2018 study by Reiman, *et al* on Humidity as a non-pharmaceutical intervention for influenza A (also here) looked at the prevalence of influenza virus in air and on surfaces in classrooms with and without humidification, and found that the humidified classrooms had less virus. Moriyama, *et al* briefly mention maintaining indoor relative humidity at 40 to 60 percent at normal room temperature as a possible intervention to reduce transmission of respiratory viruses. This is also advocated here.

This level of relative humidity differs somewhat from current recommendations. In the US, the EPA recommends “…high humidity keeps the air moist and increases the likelihood of mold. Keep indoor humidity between 30 and 50 percent”, a recommendation echoed by the CDC, who also are concerned with mold, and make no mention of respiratory viruses. Research to clarify the relative health risks of mold from high humidity and of respiratory viruses from low humidity would clearly be helpful. In the meantime, it would seem prudent to maintain relative humidity at at least 50 percent, especially in buildings inhabited by vulnerable individuals.

An even lower cost intervention that has a substantial probability of reducing the spread of COVID-19 is to tell people to get some sunlight, or if they don’t, to at least take vitamin D supplements. As discussed above, there is some reason to believe that sunlight is a factor in the seasonality of influenza, probably through its effect on vitamin D levels, and perhaps also through other means. It may well be a factor for COVID-19 as well. As stated in this commentary by Haroon, *et al*:

Regular intake of standard-dose vitamin D is not associated with adverse effects such as hypercalcaemia or renal stones. This would be a safe intervention to recommend to at-risk population groups and strengthens the argument for increased vitamin D fortification at a population level. It is highly questionable to wait for the results of RCTs before this recommendation should be made.

There has for many years been much debate on the health benefits of sunlight and vitamin D supplementation, with some evidence implicating low levels of vitamin D in many health conditions, while other studies find no effect. There is, however, no reason to think that taking 1000 IU of vitamin D per day has any harmful effects, nor that moderate exposure to the sun is bad. Nevertheless at the Health Canada page on Sun safety basics we see exhortations to “Protect against UV rays all year round”; “Limit your time in the sun”; “When the UV index is 3 or higher, wear protective clothing, sunglasses, and sunscreen”. There is no recommendation to take vitamin D supplements. This advice can only be characterized as dangerous and irresponsible.

Another possible driver of seasonality for respiratory infections is social behaviour, with school attendance possibly being a major factor. Reducing this factor could be a considerable help in avoiding a resurgence of infection in the fall.

An obvious way to reduce virus transmission at school is to encourage home schooling. Trying to pressure families into homeschooling who would otherwise be disinclined to do so would not be appropriate, but for families who are considering this option, COVID-19 provided an additional reason to do so, and for governments to be supportive of this choice.

For children who are not home schooled, viral transmission will be reduced if they are in a smaller class. Cutting class sizes in half (say, from 20 students to 10 students) seems likely to substantial reduce transmission. (Schools that already have small classes of around 10 students would stay the same.)

This may seem infeasible — how do we quickly get twice as many teachers, and twice as many classrooms? There’s a simple solution: just cut the length of the school day in half. What used to be a class of 20 students becomes two classes of 10 students, one of which meets in the morning, the other in the afternoon, sharing the same teacher and classroom. As a bonus, the classroom designed for 20 students can now hold 10 students using desks that are further apart.

But won’t the amount children learn also be cut in half? No. First, it is widely accepted that students learn better in smaller classes. Beyond that, anyone who has experienced conventional primary education, or has children who have attended primary school, will be aware that less than half the day is devoted to learning in any case, and that the remainder can be cut with little loss, or indeed to good effect. For high school, some time could be switched to online learning, while in-person instruction, at a more individualized level, remains for those aspects of learning where it is most beneficial.

Where the students spend time outside their now-shorter school time is an obvious issue. The benefit is lost if they go to a daycare with 20 kids. Note that the school day is already shorter than the typical work day, so parents already have some scheme for child care. In light of COVID-19, in-person work days may well be shorter, with more work from home, and this could interact well with a shorter school day.

Aloia, Islam, and Mikhail, Vitamin D and Acute Respiratory Infections—The PODA Trial (also here).

Burrell, Howard, and Murphy, Fenner and White’s Medical Virology.

Cannell, *et al*, On the epidemiology of influenza (also here).

Dushoff, *et al*, Dynamical resonance can account for seasonality of influenza epidemics.

Haroon, *et al*, Should vitamin D supplementation be recommended to prevent COVID-19?.

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period.

Lipsitch, Seasonality of SARS-CoV-2: Will COVID-19 go away on its own in warmer weather?.

Moriyama, *et al*, Seasonality of Respiratory Viral Infections.

Price, Graham, and Ramalingam, Association between viral seasonality and meteorological factors (also here).

Reiman, *et al, * Humidity as a non-pharmaceutical intervention for influenza A (also here).

Shaman, Goldstein, and Lipsitch, Absolute Humidity and Pandemic Versus Epidemic Influenza.

Shaman, Jeon, Giovannucci, and Lipsitch, Shortcomings of Vitamin D-Based Model Simulations of Seasonal Influenza (also here).

Shaman and Kohn, Absolute humidity modulates influenza survival, transmission, and seasonality.

Tamerius, *et al*, Global Influenza Seasonality: Reconciling Patterns across Temperate and Tropical Regions (also here).

Urashima, *et al*, Randomized trial of vitamin D supplementation to prevent seasonal influenza A in schoolchildren.

So we expect that on a *logarithmic* plot of the cumulative number of cases or deaths over time, the curve will initially be a straight line, but later start to level off, approaching a horizontal line when there are no more new cases or deaths (assuming the disease is ultimately eliminated). And that’s what we mostly see in the data, except that we haven’t achieved a horizontal line yet.

On a *linear* plot of cases or deaths over time, we expect an exponentially rising curve, which also levels off eventually, ultimately becoming a horizontal line when there are no more cases or deaths. But that’s **not** what we see in much of the data.

Instead, for many countries, the linear plots of total cases or total deaths go up exponentially at first, and then approach a straight line that is not horizontal. What’s going on?

Before trying to answer this question, I’ll first illustrate the issue with some plots taken from https://www.worldometers.info/coronavirus/.

Here’s the linear plot of cases in the UK:

We see how the initial exponential rise changes into a linear rise from about April 3. We can also look at the daily number of cases:

From around April 3, the initial exponential rise in new cases changes to an approximately constant number of new cases each day.

It’s the same for deaths, with a small time lag:

Browsing around worldometers shows that the plots for many (though not all) countries are similar. For instance, Canada:

And the United States:

It’s not too hard to come up with a reason for the number of cases to rise linearly. One need only assume that the country has a limited, and fixed, testing capacity. Once the number of probable cases reaches this capacity, the number of confirmed cases can’t rise any faster than they can be tested, so even if the true number of cases is still rising exponentially, the number of reported cases rises only linearly. Now, one might instead think that testing capacity is being increased, or that cases are being reported as COVID-19 based on symptoms alone, without a confirming test, so this isn’t a completely convincing reason for linearity. But considering that the relationship of number of reported cases to number of actual cases may be rather distant, it’s maybe not too interesting to think about this further.

A linear rise in the number of deaths seems harder to explain.

The analogue of limited testing capacity would be limited hospital capacity. But for that to produce a linear rise in reported deaths, we’d need to assume not only that hospital capacity has been exceeded (which seems not to be true in most places) but also that only the deaths from COVID-19 that occur in hospitals are reported.

It’s possible to imagine situations where the true number of deaths rises exponentially at first, but then linearly. For example, people could live along a river. Infections start among people at the river’s mouth, and expand exponentially amongst them, until most of them are infected. It also spreads up the river, but only by local contagion, so the number of deaths (and cases) grows linearly according to how far up-river it has spread. This scenario, however, seems nothing like what we would expect in almost all countries.

Most countries have taken various measures to slow the spread of COVID-19. We might expect that in some countries, these measures are insufficient, and that the growth in total deaths (and daily deaths) is still exponential, just with a longer doubling time. We might expect that in some other countries, the measures are quite effective, so that the number of new deaths is now declining exponentially, with the plot of total deaths levelling off to a horizontal line. To get a linear growth in number of deaths, the measures taken would need to be just effective enough, but no more, that they lead to a constant number of deaths per day, neither growing or shrinking. Considering that various disparate measures have been taken, it seems like an unlikely coincidence for their net effect to just happen to be a growth rate of zero.

What’s left? Well, it could be that the growth in total deaths is not really linear, but is instead now growing exponentially with a very long doubling time, or is instead levelling off, but very slowly.

Certainly there are *some* countries where total deaths seem to be levelling off, such as Spain:

Maybe if we only looked at these plots up until about April 7, we’d see the growth in total deaths as being nearly linear (after about March 26), and the number of daily deaths as being almost constant.

But the number of countries where growth in total deaths is almost linear seems more than I’d expect. Could these countries have very precisely calibrated the magnitude of their infection control measures to keep the number of new cases constant, thereby avoiding overwhelming their health care systems at minimal social and economic cost? This seems unlikely to me. I’m puzzled.

]]>Here, I’ll look at some of the evidence on how body mass — formalized as Body Mass Index (BMI, weight in kilograms divided by squared height in metres) — influences prognosis for respiratory diseases. Information specific to COVID-19 is still scant, but there is more data on influenza and on other respiratory infections (which includes coronaviruses other than COVID-19). Information on how BMI relates to general mortality should also be helpful.

Below, I’ll look at two relevant papers, plus a preliminary report on COVID-19. To preview my conclusions, it seems that being underweight and being seriously obese are both risk factors for serious respiratory illness. Furthermore, it seems that “underweight” should include the lower part of the “normal weight” category as defined by the WHO. Official advice in this respect seems dangerously misleading.

The most relevant study addressing how body mass affects outcomes of respiratory disease that I could find is the following:

J.-A. S. Mozer, *et al* (2018) Underweight, overweight, and obesity as independent risk factors for hospitalization in adults and children from influenza and other respiratory viruses, *Influenza and Other Respiratory Viruses*. Available from the journal site, and also here.

This study looked at several thousand patients who sought medical attention for an influenza-like illness in Mexico between April 2010 and March 2014. Of these, 3248 were adults 19 or older who were not excluded for some reason (I will ignore the younger patients, which the paper analyses separately). Among these patients, 1083 were classified as having severe disease, as indicated by their having been hospitalized.

For each patient, they recorded age, sex, and presence of a chronic medical condition (yes/no). They converted BMI to one of the categories defined by the WHO:

underweight (BMI < 18.5) normal (18.5 ≤ BMI <25) overweight (25 ≤ BMI <30) obese (30 ≤ BMI <35) morbidly obese (BMI ≥ 35)

They used the “normal” category as the reference. They performed tests for presence of influenza virus (various strains) as well as several other respiratory viruses, including coronaviruses NL63, OC43, 229E, and (from June 2012 on) HKU1, classifying each patient as having influenza (and possibly other viruses), having non-influenza respiratory virus, or having no detected virus.

In their main analysis, the latter two groups were combined, resulting in a group of 553 patients with influenza and a group of 2695 patients with other viruses or no detected virus. Logistic regression models for whether or not the illness was severe were fit separately for these two groups, with age, sex, presence of chronic conditions, and BMI category as covariates. The results are shown in their Table 2, reproduced below:

If *p* is the probability of severe illness, the “odds” of severe illness are defined to be *p*/(1-*p*). The “odds ratios” shown above indicate how the modeled odds change if the covariate value is changed (changed by one year for age).

For example, if the model says that for a normal-weight influenza patient of a certain age, sex, and chronic medical condition status the odds of severe illness are 1.5 (corresponding to a probability of 0.6), the odds of severe illness for another influenza patient with the same covariates except that they are underweight would (according to the model) be 1.5×5.20=7.8 (corresponding to a probability of 0.89).

An odds ratio greater than 1 indicates that the risk of severe illness is greater for the “active” value of the covariate, odds ratios less than 1 that the risk is less. The numbers in parentheses are the confidence interval for the estimated odds ratios. A p-value of less than 0.05 is taken as an indication that the estimated odds ratio is likely to be on the correct side of 1, and such estimates are in bold (except that the overweight estimate for influenza is mistakenly bolded when it shouldn’t be). A p-value less than 0.05 also corresponds to the confidence interval for the odds ratio not containing the value 1.

Unfortunately, the paper fails to say which way around sex was coded, so we can’t tell whether it’s males or females who have a reduced risk if they have influenza (odds ratio of 0.54) and an increased risk if they have a different respiratory disease (odds ratio of 1.83). This result seems odd, but could be due to various non-obvious factors (eg, a difference in relative propensity of males and females to seek medical attention for influenza versus other illnesses).

The authors checked for interactions between the effect of BMI and other covariates, and fit additional logistic regression models to subsets of the data when interactions were significant. However, the model above, which divides the data only into influenza and non-influenza groups, captures their main conclusion, which is expressed in the first sentence of their discussion as follows:

We have demonstrated that adults that are underweight or morbidly obese are more likely to be hospitalized from an influenza-like illness, regardless of the causative agent of the illness, than normal-weight adults.

One should keep in mind that this is an observational study. The relationships seen between body mass and seriousness of illness do not necessarily represent cause and effect. For example, it could be that in Mexico smoking is especially common among underweight people, and that that, rather than their weight itself, is the reason why underweight patients had higher risk. This caveat is important if one is considering modifying BMI to reduce risk, but not if one is using BMI (without other information) to forecast outcomes in a similar population.

The data for this study isn’t publicly available, and the paper does not include full details of the analysis (eg, the output of the statistics package for all the models they fit). From just looking at the paper, though, I do not see any serious problems with the analysis that would invalidate their conclusion above.

The analysis could be improved, however. Looking at whether a possible interaction is statistically significant, and on that basis either ignoring the interaction or doing completely separate analyses for subgroups, is a rather crude approach. One could likely get better estimates (with smaller uncertainties) by building a hierarchical Bayesian model, in which many potential interactions can be modelled, while main effects can still predominate if appropriate. Similarly, grouping together all patients who don’t have influenza, ignoring the information on which specific viruses they have, throws away information that might produce useful insight if a hierarchical Bayesian approach were used, in which information regarding different viruses can be pooled to the extent that they seem to have similar effects, while still allowing any differences to be seen.

More seriously, the use of the WHO categories for BMI throws away important information, especially regarding relationships for low values of BMI. The odds ratios for serious illness for underweight patients versus normal weight patients are 5.20 for influenza and 2.88 for non-influenza illness. These odds ratios are quite substantial, and raise the question of whether patients towards the low end of the WHO “normal” weight range also have substantially higher risk of serious illness.

To help answer this question, and to shed some light on whether there is a cause-and-effect relationship of BMI to serious illness, I’ll next look at studies on how BMI relates to mortality, from all causes, not just respiratory illness.

The following large-scale study assessed the impact of BMI on mortality from all causes, while trying to avoid problems of “reverse causation” (ie, that people may have lower weight because they’re dying, rather than be dying because they have lower weight):

The Global BMI Mortality Collaboration (2016) Body-mass index and all-cause mortality: individual-participant-data meta-analysis of 239 prospective studies in four continents, *The Lancet*, available from the journal site, and also here. There is an appendix of supplemental material (here or here). See also the editorial comment (here or here).

They looked at estimates from 239 earlier studies regarding the impact of BMI on mortality, and combined them (accounting for their uncertainties) to give overall estimates of how BMI relates to mortality, as well as estimates for sub-groups according to sex, age, and geographical region. They primarily looked at mortality 5 or more years after the measurement of BMI (to avoid reverse causation) amongst adults who never smoked (to avoid confounding due to smoking being a cause of lower weight) and who did not have any of various chronic diseases when the BMI was measured (again to avoid reverse causation). Here are their estimates for all regions and all ages (from 20 to 90 years), but separated by sex:

The “hazard” gives the probability of dying in a small period of time (eg, one year) in the immediate future. The modelled hazards for groups of people with differing characteristics can be compared by the “hazard ratio” for the two groups. The plot above shows hazard ratios with respect to people with BMI between 22.5 and 25.0 as the reference (ie, that group’s hazard ratio is fixed at 1). Note that the vertical scale is logarithmic (each division represents a doubling of the hazard). The vertical lines on each point give the uncertainty range; the size of the square shows how much information the estimate is based on.

This study used more BMI categories than those defined by the WHO, which were used in the first study I discussed above. Here is the finer division used in this study, and the correspondence with the larger WHO categories:

15.0-18.5 underweight 18.5-20.0 normal 20.0-22.5 " 22.5-25.0 " 25.0-27.5 overweight 27.5-30.0 " 30.0-35.0 obese 35.0-40.0 morbidly obese 40.0-60.0 "

One conclusion we might draw from this plot is noted in the paper:

…there was a substantially higher mortality not only among those in WHO’s underweight category, but also in those with BMI 18·5 kg/m2 to <20 kg/m2, suggesting that in excessively lean adult populations underweight remains a cause for concern.

The study also looks at mortality due to various causes, including respiratory disease. Hazard ratios for death from respiratory disease are shown in the following plot:

From Figure 4

Respiratory diseases include respiratory infections such as influenza and coronavirus, but also some chronic conditions, so this is not a perfect match to our current interest. It nevertheless seems notable that risk of death from respiratory disease is elevated not only in the WHO “underweight” category (BMI<18.5) but also in the parts of the WHO “normal weight” category with BMI between 18.5 and 22.5. This strengthens the suspicion that the high odds ratios for influenza-like illness being serious found in the first study I discussed, for patients with BMI<18.5, may also apply to patients in the lower part of the WHO “normal weight” range. The worry that the results for influenza-like illness might be affected by reverse causality or confounding by smoking is also lessened by the results of this study, which was designed to avoid these problems.

Considering that COVID-19 and influenza tend to be more serious in older patients, it is of interest that the study also looks at how the mortality to BMI relationship differs by age. This plot shows how the hazard ratios vary with both BMI and age (with age 40-59 and BMI 22.5-25 as reference):

From eFigure 7 in the supplementary information

The BMI range with the smallest estimated hazard ratio increases with age. For the 80+ age group the minimum is in the BMI 25.0 to 27.5 range, the low end of the WHO “overweight” category, and for the 70-79 age group this category and the top end of the “normal weight” category (BMI 22.5 to 25.0) have very nearly the same hazard ratio. The paper finds that the minimum hazard ratio for people age 70 and above is at a BMI of 24, but this conclusion is likely highly influenced by the functional form of the curve that they fit — visually a value above 25 seems more plausible. The claim in the paper that

This analysis has shown that both overweight and obesity (all grades) were associated with increased all-cause mortality…Our results challenge recent suggestions that overweight and moderate obesity are not associated with higher mortality…

does not seem to be entirely supported by their results, with respect to people age 70 and above.

The paper unfortunately gives no breakdown by age for different causes of death. Comparing the plots above for all-cause mortality and for mortality due to respiratory disease, we can see that the BMI with minimum hazard ratio is larger for respiratory disease mortality than for all-cause mortality by roughly 2 units. It is not clear that the same shift will apply to all age groups, but it does seem likely that for people age 70 and above — those most at risk from COVID-19 and influenza — the minimum risk for mortality from respiratory disease is for a BMI in the “overweight” category, there is substantial increased risk for BMI less than 20 or more than 35, and there is moderately increased risk for BMI in the range 20-22.5 or in the range 30-35.

Of course, it is not clear that these results for influenza-like illness and respiratory disease in general will hold for a new disease such as COVID-19.

Some data on 775 COVID-19 patients admitted to critical care in the UK is now available here. The source is as follows:

These data derive from the ICNARC Case Mix Programme Database. The Case Mix Programme is the national clinical audit of patient outcomes from adult critical care coordinated by the Intensive Care National Audit & Research Centre (ICNARC). For more information on the representativeness and quality of these data, please contact ICNARC.

Some commentary on this preliminary release of data can be seen here. As noted in this commentary and in the report itself, the data on outcomes for patients admitted to a critical care unit is biased because not all patients have an outcome yet. (Outcomes that happen sooner will be overrepresented.) So I’ll look only at what might be inferred from the data on patients admitted to critical care, as compared to the general population. (But note that since the epidemic has not emerged uniformly over the UK, the general population is not quite the right group to compare with.) The report also has data on viral pneumonia patients (non-COVID-19) admitted from 2017 to 2019, which provides another helpful comparison.

One notable and somewhat puzzling fact is that compared to the viral pneumonia patients, the COVID-19 patients are much less likely to have very severe comorbidities. It’s not clear whether this is because this information is not being completely recorded, or because COVID-19 patients with severe comorbidities die before they can get to critical care, or because such patients are refused admission to critical care, or because (most likely, I think) this is due to COVID-19 being unusually dangerous for mostly-healthy people. The proportion of male patients is also higher (70.5%) for COVID-19 than for non-COVID-19 viral pneumonia (54.3%). These are indications that COVID-19 and non-COVID-19 viral pneumonia are not entirely similar, so some caution is warranted when trying to apply results from other diseases to COVID-19.

The report has data on BMI for the COVID-19 patients, along with age and sex matched BMI figures for the general population. Unfortunately, they use broad categories similar to the WHO categories. The BMI<18.5 category has only 9 patients, too few for meaningful results. For the others, the proportion in each category is as follows (Table 1 and Figure 5):

BMI COVID-19 general population odds ratio 18.5 to <25 26.6% 26.0% 1.00 (reference) 25 to <30 34.4% 42.0% 0.80 30 to <40 31.0% 28.5% 1.06 40+ 6.7% 2.5% 2.62

I’ve added the odds ratios for admission to critical care for COVID-19, computed from the other numbers. Since only a small fraction of the population is admitted to critical care, the odds ratio here is the same as the “relative risk”. (I couldn’t calculate odds ratios for non-COVID-19 viral pneumonia due to the lack of figures on BMI in the general population adjusted for age and sex to match that data.)

It seems that having a BMI of 40 or more (extreme obesity) substantially increases the risk of getting a serious case of COVID-19. Less extreme obesity may lead to a small increase in risk. Being “overweight” according the WHO categorization (BMI from 25 to 30) seems to reduce the risk by a fair amount, compared to having a “normal weight” (BMI from 18.5 to 25).

We can again see here the bad effects of people using the WHO BMI categories, especially the supposedly “normal weight” category, which as I discuss above includes a range from 18.5 to 20.0 that has elevated mortality from all causes, and especially from respiratory disease, as well as the range from 20.0 to 22.5 which also has elevated respiratory disease mortality. Because of its failure to give a finer breakdown by BMI, we cannot tell from this report whether these lower weight categories have higher risk of requiring critical care for COVID-19. We also cannot tell how the risk in the upper end of the “normal weight” category (BMI from 22.5 to 25) compares to the risk in the “overweight” category (BMI 25 to 30). The “overweight” category itself could usefully be broken down into two categories, as in the study of all-cause mortality discussed above.

UPDATE: I now discuss a newer version of this data at the end of this post.

The studies I’ve reviewed above support one official COVID-19 assessment that is seen, for example, in this statement from the CDC (see here):

Groups at Higher Risk for Severe Illness

…

People with severe obesity (body mass index [BMI] of 40 or higher)

It might even be justified to extend this to BMI of 35 or greater, though I think going any lower would be dubious.

However, I’ve seen no official advice that a low BMI puts one at risk of severe illness and death. I have come across other indications along these lines, however. For example, this quote from a doctor (see here):

When you are critically ill you lose 7 per cent of your muscle mass for every day that you are on a ventilator – it becomes clear why, if you are 85 and weigh 45 kilos, we are not keen to put those people on ventilators, irrespective of Covid.

Indeed, some official advice tells you that a BMI of 18.5 is just fine. Here are excerpts from the Canadian Guidelines for Body Weight Classification in Adults:

Research studies in large groups of people have shown that the BMI can be classified into ranges associated with health risk. There are four categories of BMI ranges in the Canadian weight classification system. These are:

- underweight (BMI less than 18.5);
- normal weight (BMIs 18.5 to 24.9);
- overweight (BMIs 25 to 29.9), and
- obese (BMI 30 and over).
Q10: I am concerned that my 25-year old daughter is underweight, but according to the weight classification guidelines she is within the ‘normal weight’ category.

A: In general, Canadian adults who have a BMI within the ‘normal weight’ category have the least risk of developing weight-related health problems…

It is pretty clear that this is dangerously false statement. It’s especially dangerous at the present time, since low weight is more of a risk factor for respiratory illness than for some other diseases.

Recognizing the likely risk to people with low BMI (around 20 or less) would influence public health policy. First, it would mean devoting effort to better estimating the risk of low BMI, for COVID-19 and other diseases. A good start on this would be to stop grouping all individuals with BMI in the 18.5 to 25 range together as “normal weight”, discarding the information that is needed to investigate this issue. It would also affect where interventions are directed. In particular, in some parts of the world many people have low BMI due to food shortage. COVID-19 may be a reason to ramp up efforts to solve this problem.

At an individual level, BMI is something that can potentially be modified, as a way to reduce risk (from COVID-19 or generally). Some caution is needed here. For one thing, BMI is not a perfect measure of underlying physical condition. Perhaps there are some people for whom a BMI of 18.5 is healthy, due to unusual genetic factors, for example. Also, attempts to drastically change BMI in a short period of time could be counterproductive. Certainly, people whose BMI has been changed by such a drastic effort are not typical subjects of the studies I reviewed above, so the results of those studies may not apply to them.

Nevertheless, I think people with a BMI over 35 should endeavour not to gain weight, and people with a BMI under 20 should take care not to lose weight — both unfortunately possible as people’s routines change while avoiding COVID-19. It might also be good for people to undertake moderate efforts to move their BMI into a range from about 21 to 27 (perhaps slightly higher for people over 70).

**UPDATE (2020-04-11):** There is now an updated report from ICNARC on UK patients with COVID-19 in critical care (here, with an appendix). They now have information on 3883 COVID-19 patients in critical care. This new data doesn’t substantially change the conclusions I drew above from the earlier report on 775 patients, but does allow more precise statements.

Here’s a update with the new data on the odds ratios for entering critical care for COVID-19 for different BMI categories:

BMI critical care general odds for COVID-19 population ratio <18.5 0.64% 0.64% 0.99 (0.65,1.51) 18.5 to <25 25.79% 25.37% 1.00 (reference) 25 to <30 35.04% 42.28% 0.82 30 to <40 31.33% 28.87% 1.07 40+ 7.20% 2.84% 2.49 (2.19,2.83)

I’ve now put in a line for BMI<18.5, since with 21 cases in this category some meaningful information can be obtained. I’ve included 95% confidence intervals for the odds ratio for this category and for the BMI 40+ category (with 235 cases); the other categories have 842 to 1144 cases and so have more precise estimates.

Note that the odds ratios in this situation are the same as the relative risks (ratios of probabilities of being in critical care) with respect to the reference category, since the probability of someone in the UK being in critical care for COVID-19 is very small.

]]>FBM implements several general-purpose Markov chain sampling methods, such as Metropolis updates, Hamiltonian (Hybrid) Monte Carlo, and slice sampling. These methods can be applied to distributions defined by simple formulas, including posterior distributions for simple Bayesian models. Several more specialized modules have been written that implement posterior distributions for more complex models, including Bayesian neural networks, Gaussian process models, and mixture models (including Dirichlet process mixture models).

Compared to the version from 2004, additions and improvements include:

- A more user-friendly syntax for specifying neural network architectures.
- A somewhat rudimentary module for simulating molecular systems with the Lennard-Jones potential, in NVT and NPT ensembles. This is meant to support research into how well MCMC methods work in this application area.
- A module implementing a specialized Bayesian model for inferring the location of sources of atmospheric contamination.
- An ‘slevel’ Markov chain operation, that supports updating the Uniform(0,1) value used to make accept/reject decisions, as part of the Markov chain state.
- Numerous detailed improvements and bug fixes.

One thing to note is that the computation times in the examples are for a long-obsolete computer. Compute times on a modern desktop computer will likely be smaller by a factor of ten or more.

This software comes with extensive documentation, including tutorial examples, which may be accessed here.

]]>Click on image for larger version.

Canal du Midi, Toulouse, July 2019. Nikon FG, Nikon Series E 50mm 1:1.8 lens, Kodak Color Plus 200 film, Nikon Coolscan V.

]]>