Avoiding Self Transitions in Gibbs Sampling

I have a new paper, “Modifying Gibbs sampling to avoid self transitions”. The idea is that an ordinary Gibbs sampling update for a variable will often choose a new value that is the same as the old value. That seems like a waste of time, and would be better avoided.

It’s not a new idea. I have long been aware of a method from 1996 due to Jun Liu, which reduces self transitions by replacing Gibbs sampling with Metropolis-Hastings updates using a proposal to change to a different value. This reduces self transitions, though not to the minimum possible, since the proposal may be rejected.
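For concreteness, here is a sketch of Liu's Metropolized Gibbs update for a single discrete variable, written in Python with my own (hypothetical) naming; it assumes the variable's conditional probabilities have already been computed, and is meant only to illustrate the idea:

```python
import random

def metropolized_gibbs_update(probs, x, rng=random):
    """One update for a discrete variable with conditional distribution
    `probs`, currently at value x, using Liu's modification of Gibbs
    sampling: propose a value different from x, then accept or reject."""
    others = [i for i in range(len(probs)) if i != x]
    # Propose y != x with probability probs[y] / (1 - probs[x]).
    r = rng.random() * (1.0 - probs[x])
    cum = 0.0
    y = others[-1]          # fallback guards against floating-point slack
    for i in others:
        cum += probs[i]
        if r < cum:
            y = i
            break
    # Accept with probability min(1, (1 - probs[x]) / (1 - probs[y])),
    # which leaves the conditional distribution `probs` invariant.
    if rng.random() < (1.0 - probs[x]) / (1.0 - probs[y]):
        return y
    return x                # a rejected proposal is still a self transition
```

When probs[x] is large, the proposal is often rejected, which is why this reduces self transitions without eliminating them.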

Then, a while ago, I came across a method described in 1992 by Frigessi, Hwang, and Younes that reduces self transitions further than Liu’s method. This was independently rediscovered by Tjelmeland in 2004. But it seems it’s not well known, partly because neither of their papers really promotes it. It occurred to me that it might be good to write a short note, maybe five pages long, describing these methods and comparing them.

Then I came across some methods by Suwa and Todo, from 2010 and 2022, that reduce self transitions to the theoretical minimum, using an entirely different approach.

I also thought of a new method, sort of the reverse of the Frigessi et al. method. And then I thought of a modification of this new method that also reduces self transitions to the theoretical minimum. And then I thought of another completely different class of methods, including one that also achieves the theoretical minimum.

Along the way, I wondered what one can prove about these methods. Liu’s method, and that of Frigessi et al. and Tjelmeland, can be shown to be better than Gibbs sampling (in the context of random selection of a variable to update) using Peskun’s Theorem, because they are reversible methods that always increase every probability for moving to a different value. But Peskun’s Theorem doesn’t apply to the methods of Suwa and Todo, or to any of my new methods. Some of these methods aren’t reversible, and those that are reversible can decrease some of the probabilities for moving to different values.

There are actually two theoretical issues here. The first is whether, when seen as a local update for a single variable, one of these new methods is superior to Gibbs sampling. The second is whether such superiority carries over to a full MCMC scheme, in which all the variables are updated in some fashion.

These questions inspired a paper by myself and Jeffrey Rosenthal, “Efficiency of reversible MCMC methods: elementary derivations and applications to composite methods”. As the title indicates, we only deal with reversible update methods, with the application also limited to full schemes in which the variable to update is selected randomly. But in that context we do show how variable updates can be shown to improve on Gibbs sampling, and how such improvements can carry over to methods that combine such updates. We also provide a self-contained and accessible derivation of the needed theory.

So, with the theoretical issues dealt with, I could get back to my original paper. It turned out to be a bit longer than the five pages I’d planned. About seventy-nine pages longer.

Some of these pages are devoted to details of efficient implementations of all the methods. There are also extensive empirical assessments of performance on four different problems. In addition to showing how well the different methods perform, these experiments reveal some interesting phenomena regarding the choice of the order in which to update variables, and the use of “thinned” estimates. There’s a GitHub repository with the code and results.

Though several of the methods I tested performed well, I think one of my new methods is likely to be the most robust, and it also has the best theoretical guarantees. I call this method ZDNAM, which stands for Zero-self Downward Nested Antithetic Modification (maybe a bit of a mouthful). If you’ve been using Gibbs sampling for discrete variables, you might give it a try!

2024-03-28 at 10:11 pm

Staggered Stream


(more…)

2024-03-28 at 10:10 pm

Plotting from the command line — a new version of ‘graph’

I’ve forked the GNU ‘plotutils’ package, written mostly by Rob Maier, which contains the ‘graph’ program. I’ve added new features to ‘graph’ to make it a better tool for producing plots from the Linux/Unix/macOS command line.

My main motivation is to use it with my Software for Flexible Bayesian Modeling (FBM), which consists of a set of programs to be run from the command line. One crucial FBM program is ‘net-plt’, which displays information from a log file for an MCMC run of a Bayesian neural network model. Here’s an example use of ‘graph’ to display the average squared error on training and test cases with networks from an MCMC run that is used as an example in the FBM documentation:

net-plt t bB rlog.net | graph -n

And here is the resulting plot:

(more…)

2020-12-22 at 11:23 pm

New version of pqR, with automatic differentiation and arithmetic on lists

I’ve released pqR-2020-07-23, a new version of my variant implementation of R.  You can install it on Linux, Windows, or Mac as described at pqR-project.org. Installation must currently be from source, similarly to source installs of R Core versions of R.

This version has preliminary implementations of automatic differentiation and of arithmetic on lists. These are both useful for gradient-based optimization, such as maximum likelihood estimation and neural network training, as well as gradient-based MCMC methods. List arithmetic is helpful when dealing with models that have several groups of parameters, which are most conveniently represented using a list of vectors or matrices, rather than a single vector.

You can read the documentation on these facilities here and here. Some example programs are in this repository. I previously posted about the automatic differentiation facilities here. Automatic differentiation and arithmetic on lists for pqR are both discussed in this talk, along with some other proposals.

For the paranoid, here are the shasum values for the compressed and uncompressed tar files that you can download from pqR-project.org, allowing you to verify that they were downloaded uncorrupted:

c1b389861f0388b90122cbe1038045da30879785 pqR-2020-07-23.tar.gz
04b4586601d8796b12c310cd4bf81dc057f33bb2 pqR-2020-07-23.tar
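If you prefer scripting the check, here is a minimal self-contained sketch in Python, using only the standard library; run it in the directory containing the downloaded files (it quietly skips files that aren't present):

```python
import hashlib
import os

def sha1_of(path):
    """Return the SHA-1 digest of a file as a hex string, reading in
    chunks so that large tar files need not fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# Expected digests, copied from the listing above:
expected = {
    "pqR-2020-07-23.tar.gz": "c1b389861f0388b90122cbe1038045da30879785",
    "pqR-2020-07-23.tar":    "04b4586601d8796b12c310cd4bf81dc057f33bb2",
}

for name, want in expected.items():
    if os.path.exists(name):
        print(name, "OK" if sha1_of(name) == want else "MISMATCH")
```

Equivalently, the command-line `shasum` utility with a checksum file does the same job.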

2020-07-25 at 1:38 pm

Critique of “Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period” — Part 4: Modelling R, seasonality, immunity

In this post, fourth in a series (previous posts: Part 1, Part 2, Part 3), I’ll finally talk about some substantive conclusions of the following paper:

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period, Science, vol. 368, pp. 860-868, 22 May 2020 (released online 14 April 2020).  The paper is also available here, with supplemental materials here.

In my previous post, I talked about how the authors estimate the reproduction numbers (R) over time for the four common cold coronaviruses, and how these estimates could be improved. In this post, I’ll talk about how Kissler et al. use these estimates for R to model immunity and cross-immunity for these viruses, and the seasonal effects on their transmission. These modelling results inform the later parts of the paper, in which they consider various scenarios for future transmission of SARS-CoV-2 (the coronavirus responsible for COVID-19), whose characteristics may perhaps resemble those of these other coronaviruses.

The conclusions that Kissler et al. draw from their model do not seem to me to be well supported. The problems start with the artifacts and noise in the proxy data and R estimates, which I discussed in Part 2 and Part 3. These issues with the R estimates induce Kissler et al. to model smoothed R estimates, which results in autocorrelated errors that invalidate their assessments of uncertainty. The noise in R estimates also leads them to limit their model to the 33 weeks of “flu season”; consequently, their model cannot possibly provide a full assessment of the degree of seasonal variation in R, which is a matter of vital importance. The conclusions Kissler et al. draw from their model regarding immunity and cross-immunity for the betacoronaviruses are also flawed, because they ignore the effects of aggregation over the whole US, and because their model is unrealistic and inconsistent in its treatment of immunity during a season and at the start of a season. A side effect of this unrealistic immunity model is that the partial information on seasonality that their model produces is biased.

After justifying these criticisms of Kissler et al.’s results, I will explore what can be learned using better incidence proxies and R estimates, and better models of seasonality and immunity.

The code I use (written in R) is available here, with GPLv2 licence.

(more…)

2020-07-06 at 10:53 pm

Critique of “Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period” — Part 3: Estimating reproduction numbers

This is the third in a series of posts (previous posts: Part 1, Part 2, next post: Part 4) in which I look at the following paper:

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period, Science, vol. 368, pp. 860-868, 22 May 2020 (released online 14 April 2020).  The paper is also available here, with supplemental materials here.

In this post, I’ll look at how the authors estimate the reproduction numbers (R) over time for the four common cold coronaviruses, using the proxies for incidence that I discussed in Part 2. These estimates for R are used to model immunity and cross-immunity for these viruses, and the seasonal effects on their transmission. These modelling results inform the later parts of the paper, in which they consider various scenarios for future transmission of SARS-CoV-2 (the coronavirus responsible for COVID-19), whose characteristics may perhaps resemble those of these other coronaviruses.
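The general approach — though not necessarily the paper's exact procedure — can be sketched as follows: estimate R at time t as incidence at t divided by the “infection pressure” from recent incidence, weighted by an assumed generation-interval distribution. A minimal Python illustration (function name and weights are mine, purely for exposition):

```python
def estimate_R(incidence, gen_weights):
    """Estimate reproduction numbers over time from an incidence series.
    gen_weights[s-1] is the assumed probability that the generation
    interval is s time steps; the weights should sum to 1.
    (Illustrative only -- not the exact estimator used by Kissler et al.)"""
    R = []
    for t in range(len(gen_weights), len(incidence)):
        pressure = sum(w * incidence[t - s]
                       for s, w in enumerate(gen_weights, start=1))
        R.append(incidence[t] / pressure if pressure > 0 else float("nan"))
    return R
```

With constant incidence the estimate is 1, as it should be; note that noise in the incidence proxies translates directly into noise in the R estimates, which is the root of several issues discussed below.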

I will be using the code (written in R) available here, with GPLv2 licence, which I wrote to replicate the results in the paper. Compared to the code provided by the authors (which I discussed in Part 1), it makes it easier to produce plots that help in understanding issues with the methods, and to try out alternative methods that may work better. (more…)

2020-06-24 at 6:09 pm

Critique of “Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period” — Part 2: Proxies for incidence of coronaviruses

This is the second in a series of posts (previous post: Part 1, next post: Part 3) in which I look at the following paper:

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period, Science, vol. 368, pp. 860-868, 22 May 2020 (released online 14 April 2020).  The paper is also available here, with supplemental materials here.

In this post, I’ll start to examine in detail the first part of the paper, where the authors look at past incidence of “common cold” coronaviruses, estimate the viruses’ reproduction numbers (R) over time, and use those estimates to model immunity and cross-immunity for these viruses, and seasonal effects on their transmission. The results of this part inform the later parts of the paper, in which they model the two common cold betacoronaviruses together with SARS-CoV-2 (the virus for COVID-19), and look at various scenarios for the future, varying the duration of immunity for SARS-CoV-2, the degree of cross-immunity of SARS-CoV-2 and common cold betacoronaviruses, and the effect of season on SARS-CoV-2 transmission.

In my previous post, I used the partial code released by the authors to try to reproduce the results in the first part of the paper. I was eventually able to do this. For this and future posts, however, I will use my own code, with which I can also replicate the paper’s results. This code allows me to more easily produce plots to help understand issues with the methods, and to try out alternative methods. The code (written in R) is available here, with GPLv2 licence. The data used is also included in this repository.

In this second post of the series, I examine how Kissler et al. produce proxies for the incidence of infection in the United States by the four common cold coronaviruses. I’ll look at some problems with their method, and propose small changes to try to fix them. I’ll also try out some more elaborate alternatives that may work better.

The coronavirus proxies are the empirical basis for the remainder of the paper. (more…)

2020-06-17 at 8:55 pm

Critique of “Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period” — Part 1: Reproducing the results

UPDATES: Next post in series: Part 2. Minor fix at strikethrough before last figure.

I’ve been looking at the following paper, by researchers at Harvard’s school of public health, which was recently published in Science:

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch (2020) Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period (also available here, with supplemental materials here).

This is one of the papers referenced in my recent post on seasonality of COVID-19. The paper does several things that seem interesting:

  • It looks at past incidence of “common cold” coronaviruses, estimating the viruses’ reproduction numbers (R) over time, and from that their degrees of cross-immunity and the seasonal effect on their transmission.
  • It fits an ODE model for the two common cold betacoronaviruses, which are related to SARS-CoV-2 (the virus for COVID-19), using the same data.
  • It then adds SARS-CoV-2 to this ODE model, and looks at various scenarios for the future, varying the duration of immunity for SARS-CoV-2, the degree of cross-immunity of SARS-CoV-2 and common cold betacoronaviruses, and the effect of season on SARS-CoV-2 transmission.

In future posts, I’ll discuss the substance of these contributions. In this post, I’ll talk about my efforts at reproducing the results in the paper from the code and data available, which is a prerequisite for examining why the results are as they are, and for looking at how the methods used might be improved.

I’ll also talk about an amusing / horrifying aspect of the R code used, which I encountered along the way, about CDC data sharing policy, and about the authors’ choices regarding some graphical presentations. (more…)

2020-05-27 at 2:06 pm

Seasonality of COVID-19, Other Coronaviruses, and Influenza

Will the incidence of COVID-19 decrease in the summer?

There is reason to hope that it will, since in temperate climates influenza and the four coronaviruses that are among the causes of the “common cold” do follow a seasonal pattern, with many fewer cases in the summer. If COVID-19 is affected by season, this would obviously be of importance for policies regarding “lockdown” and provision of health care resources. Furthermore, understanding the reasons for seasonal variation might point towards ways of controlling the spread of COVID-19 (caused by a coronavirus sometimes referred to as SARS-CoV-2, though I’ll usually ignore this pedantic distinction).

I’ll look here at the evidence for seasonality in influenza and the common cold coronaviruses, and to what extent one might expect COVID-19 to also be seasonal. I’ll consider three classes of possible reasons for seasonality — seasonal changes in virus survival and transmissibility, in human resistance to infection, and in social behaviour. I’ll then consider whether we might be able to enhance such seasonal effects, further reducing the spread of COVID-19 in summer, and also extend these effects to winter. (more…)

2020-04-30 at 5:40 pm

The Puzzling Linearity of COVID-19

We all understand how the total number of cases of COVID-19 and the total number of deaths due to COVID-19 are expected to grow exponentially during the early phase of the pandemic — every infected individual is in contact with others, who are unlikely to themselves be infected, and on average infects more than one of them, leading to the number of cases growing by a fixed percentage every day. We also know that this can’t go on forever — at some point, many of the people in contact with an infected individual have already been infected, so they aren’t a source of new infections. Or alternatively, people start to take measures to avoid infection.

So we expect that on a logarithmic plot of the cumulative number of cases or deaths over time, the curve will initially be a straight line, but later start to level off, approaching a horizontal line when there are no more new cases or deaths (assuming the disease is ultimately eliminated). And that’s what we mostly see in the data, except that we haven’t achieved a horizontal line yet.

On a linear plot of cases or deaths over time, we expect an exponentially rising curve, which also levels off eventually, ultimately becoming a horizontal line when there are no more cases or deaths. But that’s not what we see in much of the data.
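The expected behaviour described in the two paragraphs above can be seen in a minimal discrete-time SIR-style simulation (parameters here are purely illustrative, not fitted to COVID-19):

```python
def sir_cumulative(beta=0.3, gamma=0.1, N=1_000_000, I0=10, days=365):
    """Simulate a simple SIR epidemic in daily steps, returning the
    cumulative number of cases for each day.  The cumulative curve rises
    exponentially at first, then levels off toward a horizontal line as
    the susceptible pool is depleted."""
    S, I = N - I0, float(I0)
    cum = [float(I0)]
    for _ in range(days):
        new = beta * S * I / N          # new infections today
        S -= new
        I += new - gamma * I            # infections minus recoveries
        cum.append(cum[-1] + new)
    return cum
```

Early on, the daily increase is a roughly constant percentage of the total (exponential growth); by the end, it is essentially zero (a horizontal line) — with no prolonged straight-but-sloped phase on a linear plot.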

Instead, for many countries, the linear plots of total cases or total deaths go up exponentially at first, and then approach a straight line that is not horizontal. What’s going on? (more…)

2020-04-23 at 3:08 pm
