Forum: Ruby Why is allocated array slow?

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
8685a433738d870899ca99e698eb489c?d=identicon&s=25 Charles P. (charles_p)
on 2017-01-22 01:57
I compared speed of building an array by allocating with my desired size
and filling the array with a loop with defining an empty array and
filling it in a loop with the << operator. Much to my surprise the push
technique was faster with arrays of 10,000, 100,000, and 1,000,000
entries. WTF?

a = Array.new(size=1000000)
puts("Using #{a.size} entry array")
t = Time.new.usec
(1..1000000).each do |i|
  a[i] = i
end
t2 = Time.new.usec
puts("Loading array took #{t2 - t} microseconds")

a = []
puts("Using #{a.size} entry array")
t = Time.new.usec
(1..1000000).each do |i|
  a << i
end
t2 = Time.new.usec
puts("Loading array took #{t2 - t} microseconds")
0fa73332c8e4a3b06ea439fd3f034322?d=identicon&s=25 Ronald Fischer (rovf)
on 2017-01-23 14:16
You don't correct the times correctly.

Time.new.usec does *not* return the current time in microseconds, but
the fractional "microseconds" part of the current time. For example, if
the the number of seconds in epoch would be - measured to microseconds
precision - 1485177068.123456, Time.now.to_i would be 1485177068 and
Time.now.usec would be 123456.

Hence, calculating the difference between two values returned by the
usec method, doesn't make sense.

For getting the time difference rounded to Microseconds, do something
like

    t = Time.now
    t2 = Time.now
    diff = ((t2-t)*1000000).to_i
8685a433738d870899ca99e698eb489c?d=identicon&s=25 Charles P. (charles_p)
on 2017-01-23 19:08
Ronald,

Thanx; I should have read the doc with more cars. Unfortunately, this
doesn't explain what I'm seeing. Loading the pre-allocated array is
still consistently slower than pushing into an empty array. Here's what
the test code is now:

def time_diff(t2, t1)
  ((t2-t1)*1000000).to_i
end

a = Array.new(size=1000000)
puts("Using #{a.size} entry array")
t1 = Time.now
(1..1000000).each do |i|
  a[i] = i
end
t2 = Time.now
puts("Loading array took #{time_diff(t2, t1)} microseconds")

a = []
puts("Using #{a.size} entry array")
t1 = Time.now
(1..1000000).each do |i|
  a << i
end
t2 = Time.now
puts("Loading array took #{time_diff(t2, t1)} microseconds")

and here's some sample output from several runs:

==> ruby tryitx.rb
Using 1000000 entry array
Loading array took 146741 microseconds
Using 0 entry array
Loading array took 124513 microseconds
==> ruby tryitx.rb
Using 1000000 entry array
Loading array took 148948 microseconds
Using 0 entry array
Loading array took 123989 microseconds
==> ruby tryitx.rb
Using 1000000 entry array
Loading array took 143825 microseconds
Using 0 entry array
Loading array took 123934 microseconds
0fa73332c8e4a3b06ea439fd3f034322?d=identicon&s=25 Ronald Fischer (rovf)
on 2017-01-24 07:42
I see. Just a guess:

I don't think that reallocation of the array really costs a lot, so the
difference is mainly between calculating the index every time, and
pushing something to the end of the array, i.e. treating the array as a
stack.

I could well imagine, that a push operation to a stack comes somewhat
less expensive than replacing an existing element in the middle.

However, this depends on the implementation. I tried it on Windows with
MRI Ruby 2.2.4 (Cygwin Version), and the preallocated array took
70000-77000 microseconds, while the appending version took 84000-89000
microseconds, i.e. slightly slower.

Then I did the benchmark with Jruby 1.7.24 and got 260000-274000
Microseconds vs. 149000-150000, which corresponds to your observation.

Finally, I used Jruby 9.0.4.0 and got 351000-437000 vs. 721000-813000,
so the appending version here is MUCH more expensive. Actually,the
increase in time for this in the JRuby version is what surprises *me*
more, because I would expect JRuby 9 being faster than JRuby 1.7.
8685a433738d870899ca99e698eb489c?d=identicon&s=25 Charles P. (charles_p)
on 2017-01-24 08:57
Ah - the old "that depends" trick:)

Thanx Ronald. Great research.

Oh well ... Charlie
This topic is locked and can not be replied to.