Posts filed under ‘Statistics – Computing’

Two Hamiltonian Monte Carlo papers

Two papers involving Hamiltonian Monte Carlo (HMC) have recently appeared on arxiv.org — Jascha Sohl-Dickstein’s Hamiltonian Monte Carlo with reduced momentum flips, and Jascha Sohl-Dickstein and Benjamin Culpepper’s Hamiltonian annealed importance sampling for partition function estimation.

These papers both relate to the variant of HMC in which momentum is only partially refreshed after each trajectory, which allows random-walk behaviour to be suppressed even when trajectories are short (even just one leapfrog step). This variant is described in Section 5.3 of my HMC review. It seems that the method described in the first paper by Sohl-Dickstein could be applied in the context of the second paper by Sohl-Dickstein and Culpepper, but if so it seems they haven’t tried it yet (or haven’t yet written it up).
(more…)

2012-05-21 at 4:06 pm 2 comments

Non-random MCMC

In my post on MCMC simulation as a random permutation (paper available at arxiv.org here), I mentioned that this view of MCMC also has implications for the role of randomness in MCMC. This has also been discussed in a recent paper by Iain Murray and Lloyd Elliott on Driving Markov chain Monte Carlo with a dependent random stream.

For the simple case of Gibbs sampling for a continuous distribution, Murray and Elliott’s procedure is the same as mine, except that they do not have the updates of extra variables needed to produce a volume-preserving map. These extra variables are relevant for my importance sampling application, but not for what I’ll discuss here. The method is a simple modification of the usual Gibbs sampling procedure, assuming that sampling from conditional distributions is done by inverting their CDFs (a common method for many standard distributions). It turns out that after this modification, one can often eliminate the random aspect of the simulation and still get good results! (more…)

2012-05-03 at 11:37 pm 5 comments

MCMC simulation as a random permutation

I’ve just finished a new paper. Continuing my recent use of unwieldy titles, I call it “How to view an MCMC simulation as a permutation, with applications to parallel simulation and improved importance sampling”.

The paper may look a bit technical in places, but the basic idea is fairly simple. I show that, after extending the state space a bit, it’s possible to view an MCMC simulation (done for some number of iterations) as a randomly selected map from an initial state to a final state that is either a permutation, if the extended state space is finite, or more generally a one-to-one map that preserves volume.

Why is this interesting? I think it’s a useful mathematical fact — sort of the opposite of how one can “couple” MCMC simulations in a way that promotes coalescence of states. It may turn out to be applicable in many contexts. I present two of these in the paper. (more…)

2012-05-02 at 6:36 am 1 comment

Evaluation of NUTS — more comments on the paper by Hoffman and Gelman

Here is my second post on the paper by Matthew Hoffman and Andrew Gelman on “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo”, available from arxiv.org. In my first post, I discussed how well the two main innovations in this “NUTS’” method — ending a trajectory when a “U-Turn” is encountered, and adaptively setting the stepsize — can be expected to work. In this post, I will discuss the empirical evaluations in the NUTS paper, and report on an evaluation of my own, made possible by the authors having kindly made available the NUTS software, concluding that the paper’s claims for NUTS are somewhat overstated. The issues I discuss are also of more general interest for other evaluations of HMC. (more…)

2012-01-27 at 4:31 pm 14 comments

No U-Turns for Hamiltonian Monte Carlo – comments on a paper by Hoffman and Gelman

Matthew Hoffman and Andrew Gelman recently posted a paper called “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo” on arxiv.org. It has been discussed on Andrew’s blog.

It’s a good paper, which addresses two big barriers to wider use of Hamiltonian Monte Carlo — the difficulties of tuning the trajectory length and tuning the stepsize to use when simulating a trajectory. The name “No-U-Turn Sampler” (NUTS) comes from their way of addressing the problem of tuning the trajectory length — repeatedly double the length of the current trajectory, until (simplifying a bit) there is a part of the trajectory that makes a “U-Turn”, heading back towards its starting point. This doubling method is clever, and (as I discuss below) one aspect of it seems useful even apart from any attempt to adaptively set the trajectory length. They also introduce a method of adapting the stepsize during the burn-in period, so as to achieve some desired acceptance probability.

However, I don’t think these are completely satisfactory ways of setting trajectory lengths and stepsizes. As I’ll discuss below, these problems are more complicated than they might at first appear. (more…)

2012-01-21 at 12:38 am 8 comments

GRIMS — General R Interface for Markov Sampling

I have released a (very) preliminary version of my new MCMC software in R, which I’m calling GRIMS, for General R Interface for Markov Sampling. You can get it here.

This software differs from other more-or-less general MCMC packages in several respects, all but one of which make it, I think, a much better tool for serious MCMC applications. Here are some highlights: (more…)

2011-06-26 at 12:58 am 5 comments

Two textbooks on probability using R

This fall, I’ll be teaching a second-year course on Probability with Computer Applications, which is required for Computer Science majors.  I’ve taught this before, but that was five years ago, so I’ve been looking to see what new textbooks would be suitable.  The course aims not just to use computer science applications as examples, but also to reinforce concepts of probability with programs, and to show how simulation can be used to solve problems that aren’t easily solved analytically. I’ve used R for the programming part, and plan to again, so I was naturally interested in two recent textbooks that seemed to have similar aims:

Introduction to Probability with R, Kenneth Baclawski, Chapman & Hall / CRC.

Probability with R: An Introduction with Computer Science Applications, Jane M. Horgan, Wiley.

I’ve now had a look at both of these textbooks.  Unfortunately, they are both seriously flawed.  Even more unfortunately, although some of the flaws in these books are particularly striking, I’ve seen similar, if usually less serious, problems in many other textbooks. (more…)

2011-06-18 at 11:34 pm 20 comments

New patches to speed up R 2.13.0

I have now released a new collection of 30 patches to speed up R version 2.13.0. You can get them here

Assessing how much these patches speed up R is difficult. First of all, the speedup varies tremendously with the type of program. It also varies quite a bit with the machine and compiler used to run R. Finally, it varies in apparently random ways — changing code in one part of the R interpreter can change the speed of operations that never use the modified code by plus or minus 5% or more.  This is presumably due to the change altering the exact addresses of other code segments, with consequent effects on alignment of memory fetches or on cache behaviour.

Nevertheless, here is a comparison of R 2.13.0 without modification and with all my patches applied, with and without compilation of R functions. The tests were done with an Intel X5680 processor running at 3.33GHz in 64-bit mode using gcc 4.4.4 under Red Hat Linux with default R configuration parameters. The tests use my suite of speed tests for R.

Here are some highlights: (more…)

2011-06-09 at 3:03 pm 5 comments

Slowing down matrix multiplication in R

After I realized that some aspects of R’s implementation are rather inefficient, one of the first things I looked at was matrix multiplication.  There I found a huge performance penalty for many matrix multiplies, a penalty which remains in the current version, 2.13.0.  As discussed below, eliminating this penalty speeds up long vector dot products by a factor of 9.5 (on my new machine), and other operations where the result matrix has at least one small dimension are sped up by factors that are somewhat smaller, but still substantial. There’s a long story behind why R’s matrix multiplies are so slow… (more…)

2011-05-21 at 10:52 pm 11 comments

Speed tests for R — and a look at the compiler

I’ve gotten back to work on speeding up R, starting with improving my suite of speed tests.  Among other new features, this suite allows one to easily try out the “byte-code” compiler that is now a standard part of the latest release of R, version 2.13.0. You can get the suite here.

I’ve been running these tests on my new workstation, which has a six-core Intel X5680 processor, running at 3.33GHz.  Unfortunately, it’s clear that thing runs somewhat slower when you use all the cores at once, so for consistency one needs to do the speed tests using just one core.  (Or one needs some more elaborate, and unclear, protocol for testing the speed of R in a muticore environment.)  I haven’t figured out how to get Red Hat Linux to compile 32-bit applications yet, so all the tests are in a 64-bit environment.

I’ve started with comparing the speed of R-2.13.0 with and without functions being compiled, and with comparing R-2.13.0 (without the compiler) to R-2.11.1, which was the last release before some of my speed improvements were incorporated.  A plot of the results is here. (more…)

2011-05-13 at 9:16 am 9 comments

Older Posts


Calendar

June 2013
M T W T F S S
« Nov    
 12
3456789
10111213141516
17181920212223
24252627282930

Posts by Month

Posts by Category


Follow

Get every new post delivered to your Inbox.

Join 59 other followers