Here is my second post on the paper by Matthew Hoffman and Andrew Gelman on “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo”, available from arxiv.org. In my first post, I discussed how well the two main innovations in this “NUTS’” method — ending a trajectory when a “U-Turn” is encountered, and adaptively setting the stepsize — can be expected to work. In this post, I will discuss the empirical evaluations in the NUTS paper, and report on an evaluation of my own, made possible by the authors having kindly made available the NUTS software, concluding that the paper’s claims for NUTS are somewhat overstated. The issues I discuss are also of more general interest for other evaluations of HMC. (more…)
Matthew Hoffman and Andrew Gelman recently posted a paper called “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo” on arxiv.org. It has been discussed on Andrew’s blog.
It’s a good paper, which addresses two big barriers to wider use of Hamiltonian Monte Carlo — the difficulties of tuning the trajectory length and tuning the stepsize to use when simulating a trajectory. The name “No-U-Turn Sampler” (NUTS) comes from their way of addressing the problem of tuning the trajectory length — repeatedly double the length of the current trajectory, until (simplifying a bit) there is a part of the trajectory that makes a “U-Turn”, heading back towards its starting point. This doubling method is clever, and (as I discuss below) one aspect of it seems useful even apart from any attempt to adaptively set the trajectory length. They also introduce a method of adapting the stepsize during the burn-in period, so as to achieve some desired acceptance probability.
However, I don’t think these are completely satisfactory ways of setting trajectory lengths and stepsizes. As I’ll discuss below, these problems are more complicated than they might at first appear. (more…)
I have released a (very) preliminary version of my new MCMC software in R, which I’m calling GRIMS, for General R Interface for Markov Sampling. You can get it here.
This software differs from other more-or-less general MCMC packages in several respects, all but one of which make it, I think, a much better tool for serious MCMC applications. Here are some highlights: (more…)
In the June 23 print edition of the Globe and Mail (billed as “Canada’s National Newspaper”), there’s an article on data centres (“Hewers of wood, storers of data”), in which, on page B4, one can read the following:
Greenpeace recently released a report that said if the Internet were a country, it would be the fifth-largest consumer of energy, largely because of the massive data centres that run unseen in the background. The group estimated that the centres will use 1.9 billion kilowatt hours of electricity by 2020 — more than the amount currently used by Canada, France, Germany and Brazil combined. (The average US home uses 8,000 kilowatt hours a year.)
An exercise for the reader: How many logical fallacies, arithmetic errors, or contradictions of common knowledge can you find in this passage?
I haven’t tried to determine whether these fallacies originate in the (unidentified) Greenpeace report, or are original to the Globe and Mail.
This fall, I’ll be teaching a second-year course on Probability with Computer Applications, which is required for Computer Science majors. I’ve taught this before, but that was five years ago, so I’ve been looking to see what new textbooks would be suitable. The course aims not just to use computer science applications as examples, but also to reinforce concepts of probability with programs, and to show how simulation can be used to solve problems that aren’t easily solved analytically. I’ve used R for the programming part, and plan to again, so I was naturally interested in two recent textbooks that seemed to have similar aims:
Introduction to Probability with R, Kenneth Baclawski, Chapman & Hall / CRC.
Probability with R: An Introduction with Computer Science Applications, Jane M. Horgan, Wiley.
I’ve now had a look at both of these textbooks. Unfortunately, they are both seriously flawed. Even more unfortunately, although some of the flaws in these books are particularly striking, I’ve seen similar, if usually less serious, problems in many other textbooks. (more…)
I have now released a new collection of 30 patches to speed up R version 2.13.0. You can get them here
Assessing how much these patches speed up R is difficult. First of all, the speedup varies tremendously with the type of program. It also varies quite a bit with the machine and compiler used to run R. Finally, it varies in apparently random ways — changing code in one part of the R interpreter can change the speed of operations that never use the modified code by plus or minus 5% or more. This is presumably due to the change altering the exact addresses of other code segments, with consequent effects on alignment of memory fetches or on cache behaviour.
Nevertheless, here is a comparison of R 2.13.0 without modification and with all my patches applied, with and without compilation of R functions. The tests were done with an Intel X5680 processor running at 3.33GHz in 64-bit mode using gcc 4.4.4 under Red Hat Linux with default R configuration parameters. The tests use my suite of speed tests for R.
Here are some highlights: (more…)