http://econ.ucsb.edu/~doug/researchpapers/Testing%20for%20Regime%20Switching%20A%20Comment.pdf

For a vector of 100 million elements, replacing one element with no copying required will indeed be much, much faster than when a copy is required. However, it’s still quite possible for a sequence of updates of single elements with no copy done to take a long, long time. You’ll see that a loop doing updates one at a time to all 100 million elements of x will be slower than a copy (by a lot). Whether such updates of single elements (or small parts) are an important portion of the total computation time for your program will, of course, depend on your program…
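A minimal sketch of that comparison, assuming a much smaller vector (1e6 elements rather than 1e8) so that it finishes quickly. The names n, loop_time and copy_time are illustrative only, and the timings will vary with machine and R version, so no expected output is shown:

```r
# Sketch: many single-element updates vs. one forced full copy.
n <- 1e6          # assumption: scaled down from the 1e8 discussed above
x <- numeric(n)

# n separate single-element replacements, none of which needs a copy
loop_time <- system.time({
  for (i in seq_len(n)) x[i] <- i
})

# One full copy: y initially shares memory with x, and the first
# modification of y forces the whole vector to be duplicated
copy_time <- system.time({
  y <- x
  y[1] <- 0
})

print(loop_time)
print(copy_time)
```

Each individual update is cheap, but the loop still pays the interpreter overhead n times, which is the point being made above.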

Second, the amount of time taken compared to when I force an actual copy of an entire long vector is minuscule. So in terms of time taken, it’s either not making a copy or is somehow implemented so effectively that the time is essentially none…

Here’s the address not changing:

x = rnorm(5)

.Internal(inspect(x))

# @28ae698 14 REALSXP g0c4 [NAM(1)] (len=5, tl=0) 0.394722,0.212128,1.58906,0.0491079,-0.817408

x[2:3] = rnorm(2)

.Internal(inspect(x))

# @28ae698 14 REALSXP g0c4 [NAM(1)] (len=5, tl=0) 0.394722,0.794627,0.791023,0.0491079,-0.817408

and here’s the stark example of how little time the replacement takes, compared to when I force a copy of the entire vector to be made, where the actual copy is delayed because of R’s copy-on-change semantics:

x = rnorm(1e8)

system.time({x[1] = 3})

# user system elapsed

# 0 0 0

system.time({y = x})

# user system elapsed

# 0 0 0

system.time({y[1] = 3})

# user system elapsed

# 0.160 0.172 0.332

Guessing at what you might have meant for (b) at least, the code I show involving *tmp* is an R-level picture of what the R Core interpreter more-or-less does, but the interpreter does it faster than it would be done at the R level. In particular, at the R level, *tmp* is treated as an ordinary variable that has to appear distinct from x, whereas in the interpreter it isn’t, so there’s not as much copying done.
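As a small illustration of that R-level picture, the sketch below checks that the ordinary replacement syntax and the explicit `*tmp*` form produce the same result. The values in x and v are made up for the example; only the result is compared here, not the amount of copying each form does:

```r
x <- c(10, 20, 30, 40, 50)   # illustrative data
v <- c(-1, -2)               # illustrative replacement values

# Ordinary replacement syntax
a <- x
a[2:3] <- v

# R-level picture of the same operation, calling the
# replacement function `[<-` on a temporary variable
b <- x
`*tmp*` <- b
b <- `[<-`(`*tmp*`, 2:3, value = v)
rm(`*tmp*`)

identical(a, b)   # TRUE
```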

@28ae6f0 14 REALSXP g0c4 [NAM(1)] (len=5, tl=0) 0.88136,-0.322485,-0.308981,-1.41978,-1.25584

> x[2:3] <- rnorm(2)

> .Internal(inspect(x))

@28ae6f0 14 REALSXP g0c4 [NAM(1)] (len=5, tl=0) 0.88136,1.95407,0.223208,-1.41978,-1.25584

b) Furthermore, the standard syntax uses very little time compared to running your example code:

> x = rnorm(1e7)

> system.time(x[10000:10001] <- rnorm(2))

> system.time({`*tmp*` <- x; x <- `[<-`(`*tmp*`, 10000:10001, value=rnorm(2)); rm(`*tmp*`)})

user system elapsed

0.020 0.028 0.047

Any thoughts?

Jascha Sohl-Dickstein, Mayur Mudigonda, and Michael R. DeWeese

Hamiltonian Monte Carlo Without Detailed Balance.

International Conference on Machine Learning (2014)