What is faster in Ruby, `arr += [x]` or `arr << x`

Intuitively, the latter should be faster than the former. However, I was very surprised when I saw benchmark results:

  require 'benchmark/ips'

  b = (0..20).to_a;
  y = 21;
  Benchmark.ips do |x|
    x.report('<<')   { a = b.dup; a << y }
    x.report('+=')   { a = b.dup; a += [y] }
    x.report('push') { a = b.dup; a.push(y) }
    x.report('[]=')  { a = b.dup; a[a.size]=y }
    x.compare!
  end

The result is:

Calculating -------------------------------------
                  <<    24.978k i/100ms
                  +=    30.389k i/100ms
                push    24.858k i/100ms
                 []=    22.306k i/100ms
-------------------------------------------------
                  <<    493.125k (± 3.2%) i/s -      2.473M
                  +=    599.830k (± 2.3%) i/s -      3.009M
                push    476.374k (± 3.3%) i/s -      2.386M
                 []=    470.263k (± 3.8%) i/s -      2.364M

Comparison:
                  +=:   599830.3 i/s
                  <<:   493125.2 i/s - 1.22x slower
                push:   476374.0 i/s - 1.26x slower
                 []=:   470262.8 i/s - 1.28x slower

However, when a colleague of mine independently created his own benchmark, the result was quite the opposite:

 Benchmark.ips do |x|
   x.report('push') {@a = (0..20).to_a; @a.push(21)}
   x.report('<<')   {@b = (0..20).to_a; @b << 21}
   x.report('+=')   {@c = (0..20).to_a; @c += [21]}
   x.compare!
 end

Result:

Calculating -------------------------------------
                push    17.623k i/100ms
                  <<    18.926k i/100ms
                  +=    16.079k i/100ms
-------------------------------------------------
                push    281.476k (± 4.2%) i/s -      1.410M
                  <<    288.341k (± 3.6%) i/s -      1.457M
                  +=    219.774k (± 8.3%) i/s -      1.093M

Comparison:
                  <<:   288341.4 i/s
                push:   281476.3 i/s - 1.02x slower
                  +=:   219774.1 i/s - 1.31x slower

We also cross-ran our benchmarks and on both our machines his benchmark showed that += is noticeably slower than <<, and mine showed the opposite.

Why is that?

UPD: my Ruby version is Ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-darwin14]; my colleague's is 2.2.2 (Don't know full details, will update the post tomorrow).

UPD2: ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin12.0] my teammate's Ruby version.

Answers


I believe this is down to how MRI allocates arrays (all of this answer is very MRI specific). Ruby tries really hard to be efficient with arrays: small arrays (<= 3 elements) are packed right into the RARRAY structure for example.

Another thing is that if you take an array and start appending values one at a time, ruby doesn't grow the buffer one element at a time, it does it in chunks: this is more efficient, at the expense of a small amount of memory.

One tool to see all this is using memsize_of:

ObjectSpace.memspace_of([]) #=> 40 (on 64 bit platforms
ObjectSpace.memspace_of([1,2]) #=> 40 (on 64 bit platforms
ObjectSpace.memsize_of([1,2,3,4]) #=> 72
ObjectSpace.memsize_of([]<<1<<2<<3<<4) #=> 200

In the first 2 cases, the array is packed within the RARRAY structure so the memory size is just the base size of any object (40 bytes). In the third case ruby had to allocate a array for 4 values (8 bytes each) so the size is 40 + 32 = 72. In the last case ruby grew the storage to 20 elements

This is relevant to the second case. The block inside the benchmark has a freshly created array which still has some spare capacity:

 ObjectSpace.memsize_of((0..20).to_a) #=> 336, enough for nearly 40 elements.

<< can just write its object to the appropriate slot, whereas += has to allocate a new array (both the object and its buffer) and copy all the data.

If I do

a = [1,2,3,4,5]
b  = a.dup
ObjectSpace.memsize_of(b) #=> 40

Here b shares its buffer with a, so is reported as using no memory beyond the basic object size. At the point where b gets written to, ruby will have to copy the data (copy on write): in the first benchmark BOTH += and << are actually allocating a new buffer of sufficient size and copying all the data across.

Here is where I get handwavy: this would completely explain things if << and += performed identically, but that's not what is happening. My reading of things is that + is simpler. All it has to do, no matter what is allocate a buffer, and memcpy some data from 2 locations - this is fast.

<< on the other hand is mutating the array so it is paying the overhead of copy on write: it is doing extra work compare to +=. For example ruby needs to track who is sharing buffers so that garbage collection of the original array is possible when no one is sharing it anymore.

A benchmark that goes someway to convincing me that this interpretation is correct is as follows:

require 'benchmark/ips'
b = (0..20).to_a.dup
y = 21
Benchmark.ips do |x|
  x.report('<<')   { a = b.dup; a << y }
  x.report('+=')   { a = b.dup; a += [y] }

  x.report('<<2')   { a = b.dup; a << y; a<< y}
  x.report('+=2')   { a = b.dup; a += [y]; a += [y] }
end

This is basically the same benchmark as the original, but now appending 2 elements. For << the copy on write overhead will only be incurred the first time. The results I get are

              <<      1.325M (± 7.6%) i/s -      6.639M
              +=      1.742M (± 9.5%) i/s -      8.677M
             <<2      1.230M (±10.3%) i/s -      6.079M
             +=2      1.140M (±10.8%) i/s -      5.656M

So appending to the array is back on top if you do it twice.


In my view, to simplify the comparison of various operators, we should remove the unnecessary code and keep the test simple.

require 'benchmark/ips'

y = 10
Benchmark.ips do |x|
    x.report('<<')   { a = [0,1,2,3,4,5,6,7,8,9]; a << y }
    x.report('+=')   { a = [0,1,2,3,4,5,6,7,8,9]; a += [y] }
    x.report('push') { a = [0,1,2,3,4,5,6,7,8,9]; a.push(y) }
    x.report('[]=')  { a = [0,1,2,3,4,5,6,7,8,9]; a[a.size]=y }
    x.compare!
end

The result of the above code is in line with the second code snippet shared in the question.

Calculating -------------------------------------
                  <<   101.735k i/100ms
                  +=   104.804k i/100ms
                push    92.863k i/100ms
                 []=    99.604k i/100ms
-------------------------------------------------
                  <<      2.134M (± 3.3%) i/s -     10.682M
                  +=      1.786M (±13.2%) i/s -      8.804M
                push      1.930M (±16.1%) i/s -      9.472M
                 []=      1.948M (± 7.9%) i/s -      9.761M

Comparison:
                  <<:  2134005.4 i/s
                 []=:  1948256.8 i/s - 1.10x slower
                push:  1930165.3 i/s - 1.11x slower
                  +=:  1785808.5 i/s - 1.19x slower

[Finished in 28.3s]

Why << is faster than +=?

Array#<< is fastest out of the four ways of appending an element to the array because it does just that - appends an element to the array. On the contrary, Array#+ appends an element but returns a new copy of the array - creation of new copy of array makes it slowest. (One can use toogle code option in documentation to understand additional work done by some methods)

Bench marking with dup

If we use below code for bench marking,

require 'benchmark/ips'

y = 10
Benchmark.ips do |x|
    x.report('<<')   { a = [0,1,2,3,4,5,6,7,8,9].dup; a << y }
    x.report('+=')   { a = [0,1,2,3,4,5,6,7,8,9].dup; a += [y] }
    x.report('push') { a = [0,1,2,3,4,5,6,7,8,9].dup; a.push(y) }
    x.report('[]=')  { a = [0,1,2,3,4,5,6,7,8,9].dup; a[a.size]=y }
    x.compare!
end

We see below results:

Calculating -------------------------------------
                  <<    65.225k i/100ms
                  +=    76.106k i/100ms
                push    64.864k i/100ms
                 []=    63.582k i/100ms
-------------------------------------------------
                  <<      1.221M (±14.3%) i/s -      6.001M
                  +=      1.291M (±13.1%) i/s -      6.393M
                push      1.164M (±14.1%) i/s -      5.773M
                 []=      1.168M (±14.5%) i/s -      5.722M

Comparison:
                  +=:  1290970.6 i/s
                  <<:  1221029.0 i/s - 1.06x slower
                 []=:  1168219.3 i/s - 1.11x slower
                push:  1163965.9 i/s - 1.11x slower

[Finished in 28.3s]

If we look carefully between two results, we see only one difference. The += entry has become first, whereas the order of rest of the methods remained same with the original result.

Why results flip when dup is used?

This is my wild guess, I am guessing that Ruby interpreter optimized the code and did not create a new array as part of += as it knew that it is working on freshly created copy of array by dup


Need Your Help

Android Studio - Illegal character 8204 error

java android android-intent android-fragments android-studio

For some reason, after rebuilding my project I get an illegal character error but nothing within my code gets underlined in red. Can someone please tell me what is wrong and how to resolve it?

Monitoring Keynote 6 presentation using ScriptingBridge

objective-c macos cocoa scripting-bridge keynote

I've developed an application which gathers some information on presentations and monitors the behaviour of the presenter during the presentation. The applications is an OS X app build in Objective-C