## Posts filed under ‘Computing’

### Fixing R’s design flaws in a new version of pqR

In particular, the extensions fix the problems that `1:n` doesn’t work as intended when `n` is zero, and that `M[1:n,]` is a vector rather than a matrix when `n` is one, or when `M` has only one column. Since changing the “`:`” operator would cause too many problems with existing programs, pqR introduces a new “`..`” operator for generating increasing sequences. Unwanted dimension dropping is also addressed in ways that have minimal effects on existing code.

The new release, pqR-2016-06-24, is available at pqR-project-org. The NEWS file for this release also documents some other language extensions, as well as fixes for various bugs (some of which are also in R-3.3.1).

(more…)

### Exact computation of sums and means

A while ago, I came across a mention of the Python math.fsum function, which sums a set of floating-point values *exactly*, then rounds to the closest floating point value. This seemed useful. In particular, I thought that if it’s fast enough it could be used instead of R’s rather primitive two-pass approach to trying to compute the sample mean more accurately (but still not exactly). My initial thought was to just implement the algorithm Python uses in pqR. But I soon discovered that there were newer (and faster) algorithms. And then I thought that I might be able to do even better…

The result is a new paper of mine on Fast exact summation using small and large superaccumulators (also available from arxiv.org).

### How large vectors in R might be stored compactly

Vectors in R can currently have elements of two sizes — 8-byte double-precision floating-point elements for `numeric’ vectors, or 4-byte elements for `integer’ or `logical’ vectors. You can also have vectors whose elements are 1-byte `raw’ values, but these raw vectors don’t support negative numbers, or NA values, so they aren’t suitable for general use.

It seems that lots of actual data vectors could be stored more compactly than at present. Many integer vectors consist solely of elements that would fit in one or two bytes. Logical vectors could be stored using two bits per element (allowing TRUE, FALSE, and NA), which would use only one-sixteenth as much memory as at present. It’s likely that many operations would also be faster on such compact vectors, so there’s not even necessarily a time-space tradeoff.

For integer and logical types, the possible compact representations, and how to work with them, are fairly obvious. The challenge is how to start using such compact representations while retaining compatibility with existing R code, including functions written in C, Fortran, or whatever. Of course, one could use the S3 or S4 class facilities to define new classes for data stored compactly, with suitable redefinitions of standard operators such as `+’, but this would have substantial overhead, and would in any case not completely duplicate the behaviour of non-compact numeric, integer, or logical vectors. Below, I discuss how to implement compact representations in a way that is completely invisible to R programs. I hope to try this out in my pqR implementation of R sometime, though other improvements to pqR have higher priority at the moment.

How to compactly represent floating-point data (of R’s `numeric’ type) is not so obvious. If the use of a compact representation is to have no effect on the results, one cannot just use single-precision floating point. I describe a different approach in a new paper on Representing numeric data in 32 bits while preserving 64-bit precision (also on arxiv). I’ll present the idea of this paper next, before returning to the question of how one might put compact representations of any sort into an R interpreter, invisibly to R programs. (more…)

### Subset replacement in pqR: Now faster and better

The latest version of pqR that I just released uses a new way of implementing subset replacement operations — such as a[i]<-1 or L$M[1:100,i]<-v. The new approach is much faster, and eliminates some strange behaviour of the previous approach. |

This change affects only interpreted code. The bytecode compiler (available since R-2.13.0) introduced a different mechanism, which is also faster than the previous approach used by the interpreter (though it still has some of the strange behaviour). This faster mechanism was one of the main reasons for byte-compiled code to be faster than interpreted code (although it would have been possible to use the new mechanism in the interpreter as well). With pqR’s new implementation of subset replacement, this advantage of byte-compiled over interpreted code is much reduced.

In addition to being faster, pqR’s new approach is also more coherent than the previous approach (still current in the interpreter for R Core releases to at least R-3.1.1), which despite its gross inefficiency and confused semantics has remained essentially unchanged for 18 years. Unfortunately, the new approach in pqR is not as coherent as it might be, because past confusion has resulted in some packages doing “wrong” things, which have to be accommodated, as least in the short term.

### New version of pqR with faster variable lookup, faster subset replacement, and more

I’ve released a new version, pqR-2014-09-30, of my speedier, “pretty quick”, implementation of R, with some major performance improvements, and some features from recent R Core versions. It also has fixes for bugs (some also in R-3.1.1) and installation glitches. |

Details are in pqR NEWS. Here I’ll highlight some of the more interesting improvements.

(more…)

### New release of pqR, with a curated repository

I have released a new version, pqR-2014-06-19, of my speedier, “pretty quick”, implementation of R. This and the previous release (pqR-2014-02-23) are maintenance releases, with bug fixes, improved documentation, and better test procedures. |

The result is that pqR now works with a large collection of 3438 packages.

### Inaccurate results from microbenchmark

The microbenchmark package is a popular way of comparing the time it takes to evaluate different R expressions — perhaps more popular than the alternative of just using system.time to see how long it takes to execute a loop that evaluates an expression many times. Unfortunately, when used in the usual way, microbenchmark can give inaccurate results.

The inaccuracy of microbenchmark has two main sources — first, it does not correctly allocate the time for garbage collection to the expression that is responsible for it, and second, its summarizes the results by the *median* time for many repetitions, when the *mean* is what is needed. The median and mean can differ drastically, because just a few of the repetitions will include time for a garbage collection. These flaws can result in comparisons being reversed, with the expression that is actually faster looking slower in the output of microbenchmark. (more…)