Why is allocated array slow?

Sadaf_N · January 22, 2017, 1:57am

I compared speed of building an array by allocating with my desired size
and filling the array with a loop with defining an empty array and
filling it in a loop with the << operator. Much to my surprise the push
technique was faster with arrays of 10,000, 100,000, and 1,000,000
entries. WTF?

a = Array.new(size=1000000)
puts(“Using #{a.size} entry array”)
t = Time.new.usec
(1…1000000).each do |i|
a[i] = i
end
t2 = Time.new.usec
puts(“Loading array took #{t2 - t} microseconds”)

a = []
puts(“Using #{a.size} entry array”)
t = Time.new.usec
(1…1000000).each do |i|
a << i
end
t2 = Time.new.usec
puts(“Loading array took #{t2 - t} microseconds”)

charles_parker · January 23, 2017, 2:16pm

You don’t correct the times correctly.

Time.new.usec does not return the current time in microseconds, but
the fractional “microseconds” part of the current time. For example, if
the the number of seconds in epoch would be - measured to microseconds
precision - 1485177068.123456, Time.now.to_i would be 1485177068 and
Time.now.usec would be 123456.

Hence, calculating the difference between two values returned by the
usec method, doesn’t make sense.

For getting the time difference rounded to Microseconds, do something
like

t = Time.now
t2 = Time.now
diff = ((t2-t)*1000000).to_i

charles_parker · January 24, 2017, 7:42am

I see. Just a guess:

I don’t think that reallocation of the array really costs a lot, so the
difference is mainly between calculating the index every time, and
pushing something to the end of the array, i.e. treating the array as a
stack.

I could well imagine, that a push operation to a stack comes somewhat
less expensive than replacing an existing element in the middle.

However, this depends on the implementation. I tried it on Windows with
MRI Ruby 2.2.4 (Cygwin Version), and the preallocated array took
70000-77000 microseconds, while the appending version took 84000-89000
microseconds, i.e. slightly slower.

Then I did the benchmark with Jruby 1.7.24 and got 260000-274000
Microseconds vs. 149000-150000, which corresponds to your observation.

Finally, I used Jruby 9.0.4.0 and got 351000-437000 vs. 721000-813000,
so the appending version here is MUCH more expensive. Actually,the
increase in time for this in the JRuby version is what surprises me
more, because I would expect JRuby 9 being faster than JRuby 1.7.

charles_parker · January 23, 2017, 7:08pm

Ronald,

Thanx; I should have read the doc with more cars. Unfortunately, this
doesn’t explain what I’m seeing. Loading the pre-allocated array is
still consistently slower than pushing into an empty array. Here’s what
the test code is now:

def time_diff(t2, t1)
((t2-t1)*1000000).to_i
end

a = Array.new(size=1000000)
puts(“Using #{a.size} entry array”)
t1 = Time.now
(1…1000000).each do |i|
a[i] = i
end
t2 = Time.now
puts(“Loading array took #{time_diff(t2, t1)} microseconds”)

a = []
puts(“Using #{a.size} entry array”)
t1 = Time.now
(1…1000000).each do |i|
a << i
end
t2 = Time.now
puts(“Loading array took #{time_diff(t2, t1)} microseconds”)

and here’s some sample output from several runs:

==> ruby tryitx.rb
Using 1000000 entry array
Loading array took 146741 microseconds
Using 0 entry array
Loading array took 124513 microseconds
==> ruby tryitx.rb
Using 1000000 entry array
Loading array took 148948 microseconds
Using 0 entry array
Loading array took 123989 microseconds
==> ruby tryitx.rb
Using 1000000 entry array
Loading array took 143825 microseconds
Using 0 entry array
Loading array took 123934 microseconds

charles_parker · January 24, 2017, 8:57am

Ah - the old “that depends” trick:)

Thanx Ronald. Great research.

Oh well … Charlie