## Archive for June, 2011

### GRIMS — General R Interface for Markov Sampling

I have released a (very) preliminary version of my new MCMC software in R, which I’m calling GRIMS, for General R Interface for Markov Sampling. You can get it here.

This software differs from other more-or-less general MCMC packages in several respects, all but one of which make it, I think, a much better tool for serious MCMC applications. Here are some highlights: (more…)

### Innumeracy at the Globe and Mail

In the June 23 print edition of the Globe and Mail (billed as “Canada’s National Newspaper”), there’s an article on data centres (“Hewers of wood, storers of data”), in which, on page B4, one can read the following:

> Greenpeace recently released a report that said if the Internet were a country, it would be the fifth-largest consumer of energy, largely because of the massive data centres that run unseen in the background. The group estimated that the centres will use 1.9 billion kilowatt hours of electricity by 2020 — more than the amount currently used by Canada, France, Germany and Brazil combined. (The average US home uses 8,000 kilowatt hours a year.)

An exercise for the reader: How many logical fallacies, arithmetic errors, or contradictions of common knowledge can you find in this passage?

I haven’t tried to determine whether these fallacies originate in the (unidentified) Greenpeace report, or are original to the Globe and Mail.
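As a starting point for the exercise, here is a minimal sketch of one sanity check, using only the two figures quoted in the passage. It assumes the 1.9 billion kilowatt hours is meant as an annual total (the passage gives no time period), and the names `DATA_CENTRE_KWH`, `US_HOME_KWH_PER_YEAR`, and `equivalent_homes` are made up for illustration.

```python
# Figures exactly as quoted in the newspaper passage.
DATA_CENTRE_KWH = 1.9e9        # "1.9 billion kilowatt hours" (assumed to be per year)
US_HOME_KWH_PER_YEAR = 8_000   # "The average US home uses 8,000 kilowatt hours a year"


def equivalent_homes(total_kwh, kwh_per_home):
    """Return how many homes would use total_kwh at kwh_per_home each."""
    return total_kwh / kwh_per_home


homes = equivalent_homes(DATA_CENTRE_KWH, US_HOME_KWH_PER_YEAR)
print(f"{homes:,.0f} average US homes")  # prints "237,500 average US homes"
```

By the article's own numbers, the projected data-centre total matches the yearly use of roughly a quarter of a million homes, which is hard to square with the claim that it exceeds the combined use of four large countries.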

### Two textbooks on probability using R

This fall, I’ll be teaching a second-year course on Probability with Computer Applications, which is required for Computer Science majors. I’ve taught this before, but that was five years ago, so I’ve been looking to see what new textbooks would be suitable. The course aims not just to use computer science applications as examples, but also to reinforce concepts of probability with programs, and to show how simulation can be used to solve problems that aren’t easily solved analytically. I’ve used R for the programming part, and plan to again, so I was naturally interested in two recent textbooks that seemed to have similar aims:

*Introduction to Probability with R*, Kenneth Baclawski, Chapman & Hall / CRC.

*Probability with R: An Introduction with Computer Science Applications*, Jane M. Horgan, Wiley.

I’ve now had a look at both of these textbooks. Unfortunately, they are both seriously flawed. Even more unfortunately, although some of the flaws in these books are particularly striking, I’ve seen similar, if usually less serious, problems in many other textbooks. (more…)

### New patches to speed up R 2.13.0

I have now released a new collection of 30 patches to speed up R version 2.13.0. You can get them here.

Assessing how much these patches speed up R is difficult. First of all, the speedup varies tremendously with the type of program. It also varies quite a bit with the machine and compiler used to run R. Finally, it varies in apparently random ways — changing code in one part of the R interpreter can change the speed of operations that never use the modified code by plus or minus 5% or more. This is presumably due to the change altering the exact addresses of other code segments, with consequent effects on alignment of memory fetches or on cache behaviour.

Nevertheless, here is a comparison of R 2.13.0 without modification and with all my patches applied, with and without compilation of R functions. The tests were done with an Intel X5680 processor running at 3.33GHz in 64-bit mode using gcc 4.4.4 under Red Hat Linux with default R configuration parameters. The tests use my suite of speed tests for R.

Here are some highlights: (more…)