How pqR makes programs faster by not doing things
One way my faster version of R, called pqR (see updated release of 2013-06-28), can speed up R programs is by not even doing some operations. This happens in statements like
for (i in 1:1000000) ..., in subscripting expressions like
v[i:1000], and in logical expressions like
This is done using pqR’s internal “variant result” mechanism, which is also crucial to how helper threads are implemented. This mechanism is not visible to the user, apart from the reductions in run time and memory usage, but knowing about it will make it easier to understand the performance of programs running under pqR.
One application of variant results is to allow expressions like
seq_len(k) to not actually generate a sequence when the result will be used in certain contexts, such as the sequence of values in a for statement, or the subscript of a vector. The way this works is that when (for example) the interpreter evaluates the expression in a for statement after the “in”, it does so by calling the “eval” function with an extra argument indicating that a “variant result” is requested, and in particular that a VARIANT_SEQ result is desired. The code that evaluates this expression may or may not pay attention to this request — the code implementing the for statement must be able to handle both this variant result and an ordinary result. The primitive operator that evaluates expressions such as
i:n can return a VARIANT_SEQ result when asked to, provided the result is an increasing sequence of integers. The VARIANT_SEQ result is a special object that contains just the start and end of the sequence. When the code implementing the for statement receives this, it just repeats the loop body with the loop variable set to each integer in this sequence, without the sequence ever being stored in memory.
Similarly, when the vector subscripting code evaluates a subscript, it too asks for a VARIANT_SEQ result. If it gets one, it just copies out the indicated portion of the vector. Time and memory are saved by not creating the sequence of indexes, and time is also saved because the subscripting code knows that the subscript identifies a contiguous block. This is also done for the first subscript of a matrix, but at present not for subscripts in other contexts, or for subscripts in an assignment to part of a vector or matrix.
any primitive asks for a VARIANT_OR result when evaluating its argument. Relational operators and some other functions such as
is.na pay attention to this request, and return the logical “or” of the vector result, rather than the vector itself, which they can sometimes do without actually computing the whole vector. For example,
any(v>0) will result in
v being scanned until the first positive element is found, or the end of
v is reached. Analogously, the
all primitive asks for a VARIANT_AND result. The
sum primitive asks for a VARIANT_SUM result, which currently only the mathematical functions of one argument (eg, exp) will deliver. Use of VARIANT_SUM avoids allocating storage for the vector that is summed, and speeds up the computation a bit, but the whole vector has to still be computed so that all elements can be summed.
The facility for doing numerical computations in helper threads depends crucially on the use of VARIANT_PENDING_OK. When an expression is evaluated with this variant request, the value returned may still be in the process of being computed. The code that requested this variant must of course be prepared to deal with a value that hasn’t been computed yet. Code that can’t deal with a pending value, such as any code written before the advent of helper threads, will not request evaluation with VARIANT_PENDING_OK, and so will see only values that have been fully computed.
A request for a variant result from evaluation of an if statement gets passed on to the evaluation of whichever branch is chosen, and similarly a variant requested when evaluating a function call gets passed on to evaluation of the body of the function. The assignment operator uses VARIANT_PENDING_OK when evaluating the right-hand side of a assignment, and for a simple assignment will store a pending result in the variable. When a variable containing a value that hasn’t been computed yet is evaluated, the pending value will be returned without delay if the variable was evaluated with VARIANT_PENDING_OK, but if not, the master thread will at that point have to wait for the computation to finish (doing it itself if no helper thread has started to do it).
The variant result facility is currently not used by byte-compiled code (i.e., compiled code will not request a variant result, and will not return one if asked). Because of this, compiling a function in pqR will sometimes slow it down rather than speed it up. It might be possible to implement variant results in compiled code. Alternatively, if interpretive overhead can be reduced further, one could abandon the compiler without regret.