New version of pqR with faster variable lookup, faster subset replacement, and more
I’ve released a new version, pqR-2014-09-30, of my speedier, “pretty quick”, implementation of R, with some major performance improvements, and some features from recent R Core versions. It also has fixes for bugs (some also in R-3.1.1) and installation glitches.
Details are in pqR NEWS. Here I’ll highlight some of the more interesting improvements.
Faster variable lookup. In both pqR and R Core implementations, variables local to a function are stored in a linked list, which is searched sequentially when looking for a variable (though this may sometimes be avoided in byte-compiled functions). So the more variables you have in your function, the slower it is to access or modify one of them. The new version of pqR often avoids this search by saving for each symbol the result from the last time that symbol was looked up in some local environment, and re-using this if the same environment is searched for that symbol again.
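The idea can be sketched in C roughly as follows. This is only an illustration under simplifying assumptions, not pqR's actual code: the `Symbol`, `Env`, and `Binding` types, the `lookup` and `define` functions, and the `search_steps` counter are all hypothetical names, and the sketch leaves out the cache invalidation that a real interpreter needs when a variable is removed or an environment is discarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; pqR's real structures differ. */
struct Symbol;
struct Env;

typedef struct Binding {
    struct Symbol *sym;      /* which variable this binding is for */
    double value;            /* stand-in for the variable's value */
    struct Binding *next;    /* next local variable in the list */
} Binding;

typedef struct Env {
    Binding *head;           /* linked list of local variables */
} Env;

typedef struct Symbol {
    const char *name;
    Env *last_env;           /* environment of the most recent successful lookup */
    Binding *last_binding;   /* binding that lookup found */
} Symbol;

static long search_steps = 0;  /* list nodes visited, for demonstration only */

/* Add a variable at the front of the environment's list. */
void define(Env *env, Binding *b, Symbol *sym, double value)
{
    assert(env != NULL && b != NULL && sym != NULL);
    b->sym = sym;
    b->value = value;
    b->next = env->head;
    env->head = b;
}

/* Find sym in env, re-using the saved result when env was also the
   environment searched the last time sym was found. */
Binding *lookup(Env *env, Symbol *sym)
{
    if (sym->last_env == env && sym->last_binding != NULL)
        return sym->last_binding;          /* no list search needed */

    for (Binding *b = env->head; b != NULL; b = b->next) {
        search_steps++;
        if (b->sym == sym) {
            sym->last_env = env;           /* remember for next time */
            sym->last_binding = b;
            return b;
        }
    }
    return NULL;                           /* not a local variable here */
}
```

In this sketch the first lookup of a symbol pays for the sequential search, while repeated lookups of the same symbol in the same environment return immediately, however many other variables the function has.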
Re-using memory when updating variables. When variables are updated with statements like i <- i+1 or v <- exp(v) we would prefer that the variable be updated by modifying its stored value, without allocating a new object (provided this value isn’t shared with some other variable). This is now done in pqR for binary and unary arithmetic operators and for mathematical functions of one argument. Eliminating such unnecessary storage allocation is important both for scalar operands (e.g., counters in while loops) and when the operands are vectors (possibly quite large).
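A minimal C sketch of the decision involved, for a statement like v <- v+1, might look like this. The `Vec` type, its `refs` sharing count, and the function names are assumptions made for illustration; the interpreter's own bookkeeping for whether a value is shared is different and more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vector object with a simple sharing count. */
typedef struct Vec {
    int refs;        /* number of variables referring to this object */
    size_t n;        /* number of elements */
    double *data;
} Vec;

Vec *vec_new(size_t n)
{
    Vec *v = malloc(sizeof *v);
    assert(v != NULL);
    v->data = malloc((n > 0 ? n : 1) * sizeof *v->data);
    assert(v->data != NULL);
    v->refs = 1;
    v->n = n;
    return v;
}

void vec_free(Vec *v)
{
    free(v->data);
    free(v);
}

/* Carry out v <- v + c and return the object the variable should now
   refer to. If v is unshared, its storage is overwritten in place;
   otherwise a new object is allocated and the old one is left intact
   for the other variables that still refer to it. */
Vec *vec_add_assign(Vec *v, double c)
{
    Vec *out = v;
    if (v->refs > 1) {
        out = vec_new(v->n);   /* shared: must not modify v itself */
        v->refs--;             /* this variable no longer refers to v */
    }
    for (size_t i = 0; i < v->n; i++)
        out->data[i] = v->data[i] + c;
    return out;
}
```

The allocation happens only on the shared path, so a loop counter or a large unshared vector is updated without creating a new object on each iteration.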
Updating in place also produces more possibilities for task merging. For example, the two operations v <- 2*v; v <- v+1 will now be merged into a single loop over the elements of v that replaces each element by twice the element plus one.
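The effect of merging these two operations can be pictured with the C loops below. Both function names are hypothetical, and this only shows the shape of the computation before and after merging, not how pqR schedules or merges tasks internally.

```c
#include <assert.h>
#include <stddef.h>

/* Unmerged: two passes over v, as for v <- 2*v followed by v <- v+1. */
void scale_then_add(double *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        v[i] = 2 * v[i];
    for (size_t i = 0; i < n; i++)
        v[i] = v[i] + 1;
}

/* Merged: one pass that replaces each element by twice itself plus one. */
void scale_add_merged(double *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        v[i] = 2 * v[i] + 1;
}
```

The merged version reads and writes each element once rather than twice, which matters most when v is too large to stay in cache between the two passes.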
Faster and better subset replacement operations. The interpreter’s handling of subset replacement operations such as a[i] <- 1, L$x <- y, L$x[i] <- 0, and diag(L$M)[i] <- 1 has been completely revised, substantially improving speed, and also fixing some long-standing problems with the previous scheme. I will discuss this important change in more detail in a later post.
Shared, read-only constants. The result of evaluating an expression may now sometimes be a shared constant, stored (on most platforms) in read-only memory. In addition to improving speed and reducing memory usage, this change means that buggy code in packages (or in the interpreter itself) that modifies an object without first checking whether it is shared will sometimes now cause a memory access fault, rather than silently producing an incorrect answer.
Faster and better-defined external function calls. The overhead of calling external functions with .C or .Fortran has been substantially reduced. Some improvements in .C and .Fortran were made in R-2.15.1; pqR now has these optimizations as well as others.
Furthermore, pqR now documents (in help(.C)) which expressions are guaranteed to return unshared objects that may safely be modified when the DUP=FALSE option is passed to .C or .Fortran. It also makes clear that DUP=FALSE should be used only to improve performance, not as a way of surreptitiously returning information to the caller without the caller referring to the list returned by .C or .Fortran. I will be writing more on the use of DUP=FALSE in a future post.
Under some circumstances, routines called via .C or .Fortran can now be run by a helper thread in parallel with other operations. This is done only if an argument of HELPER=TRUE is passed to .C or .Fortran, which should be done only when the routine performs a pure numerical computation without side effects.
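As an illustration of what "a pure numerical computation without side effects" means, here is a small C routine in the form that .C expects: every argument arrives as a pointer, nothing is returned, and results are written into an argument. The routine name `scale_vec` and its argument layout are made up for this example.

```c
/* Hypothetical routine callable through .C. It reads x, n, and factor,
   and writes only to out. It does no I/O, uses no global state, and
   does not call back into R. */
void scale_vec(double *x, int *n, double *factor, double *out)
{
    for (int i = 0; i < *n; i++)
        out[i] = *factor * x[i];
}
```

From R, a call might look something like `.C("scale_vec", as.double(x), as.integer(length(x)), as.double(2), out = double(length(x)), HELPER=TRUE)$out`, with the result taken from the list that .C returns. The exact conditions under which the helper thread is actually used are those given in pqR's help(.C), which should be consulted before relying on this.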
The speed of .Call and .External has been improved slightly. More importantly, however, within a routine called by .Call or .External, LENGTH, TYPEOF, REAL, INTEGER, LOGICAL, RAW, COMPLEX, CAR, CDR, CADR, etc. are now macros or inline functions, avoiding possibly substantial procedure call overhead.
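Why this matters can be seen in a small C sketch. The `Obj` type and the `obj_length` and `OBJ_DATA` accessors below are hypothetical stand-ins, not R's actual object layout or API; they only show the difference between an accessor the compiler can expand in place and one that requires a call into the interpreter on every use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object layout, for illustration only. */
typedef struct Obj {
    int type;
    size_t length;
    double *data;
} Obj;

/* Inline function: the compiler can reduce each use to a field load. */
static inline size_t obj_length(const Obj *x)
{
    return x->length;
}

/* Macro: expands textually to a field access, with no call at all. */
#define OBJ_DATA(x) ((x)->data)

/* A loop that uses the accessors on every iteration. */
double obj_sum(const Obj *x)
{
    assert(x != NULL);
    double s = 0.0;
    for (size_t i = 0; i < obj_length(x); i++)
        s += OBJ_DATA(x)[i];
    return s;
}
```

If the accessors in a loop like this were ordinary functions defined in another library, each iteration would pay for procedure calls that can cost more than the arithmetic itself.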
And more… Numerous other performance improvements are described in the NEWS file, which also describes other changes that improve compatibility with recent R Core releases, add a few new features, fix bugs, etc. Several changes have been made to make it easier to use fast BLAS routines for matrix multiplication and other matrix operations, which will be the topic of another post. I will also be posting soon about how the speed of pqR-2014-09-30 compares with earlier versions of pqR and with past and current R Core releases.