Array performance question


#1

Here’s a real Array performance mystery. I can’t help my friend figure
this one out. He writes,

========

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

only because I still can’t believe I’m not doing something incredibly
stupid. (I’m a newbie, “Reluctant Rubyist”)

I’m just not used to arrays behaving this way :slight_smile: Anyone have any
insight?

========

He’d love a response in his blog, but with permission I’ll copy
answers here over there. He’s not a regular ruby-talk reader…yet.

Jim


#2

2009/2/27 Jim M. removed_email_address@domain.invalid:

only because I still can’t believe I’m not doing something incredibly
stupid. Â (I’m a newbie, “Reluctant Rubyist”)

 I’m just not used to arrays behaving this way :slight_smile:  Anyone have any
insight?

========

He’d love a response in his blog, but with permission I’ll copy
answers here over there. He’s not a regular ruby-talk reader…yet.

Slightly modified test (attached):

15:44:06 Temp$ ruby bm.rb bm; ruby bm.rb bmbm; ruby19 bm.rb bm; ruby19
bm.rb bmbm
1.8.7
user system total
real
size = 10000 10000 0.125000 0.000000 0.125000 (
0.114000)
size = 100000 100000 0.859000 0.000000 0.859000 (
0.863000)
size = 1000000 1000000 24.969000 0.000000 24.969000 (
24.961000)
1.8.7
Rehearsal

size = 10000 10000 24.235000 0.032000 24.267000 (
24.308000)
size = 100000 100000 18.078000 0.000000 18.078000 (
18.150630)
size = 1000000 1000000 14.297000 0.000000 14.297000 (
14.289000)
------------------------------------------------------- total:
56.642000sec

                                user     system      total 

real
size = 10000 10000 12.031000 0.000000 12.031000 (
12.039000)
size = 100000 100000 10.250000 0.000000 10.250000 (
10.290950)
size = 1000000 1000000 8.985000 0.000000 8.985000 (
8.989000)
1.9.1
user system total
real
size = 10000 10000 0.016000 0.000000 0.016000 (
0.027000)
size = 100000 100000 0.031000 0.000000 0.031000 (
0.028000)
size = 1000000 1000000 0.047000 0.000000 0.047000 (
0.043000)
1.9.1
Rehearsal

size = 10000 10000 0.032000 0.000000 0.032000 (
0.043000)
size = 100000 100000 0.032000 0.000000 0.032000 (
0.043000)
size = 1000000 1000000 0.031000 0.000000 0.031000 (
0.043000)
-------------------------------------------------------- total:
0.095000sec

                                user     system      total 

real
size = 10000 10000 0.031000 0.000000 0.031000 (
0.039000)
size = 100000 100000 0.047000 0.000000 0.047000 (
0.043000)
size = 1000000 1000000 0.032000 0.000000 0.032000 (
0.043000)

Strange that the rehearsal has this dramatic effect. Normally it
should simply execute the block twice to warm up the system.

Kind regards

robert


#3

2009/2/27 Robert K. removed_email_address@domain.invalid:

Strange that the rehearsal has this dramatic effect. Â Normally it
should simply execute the block twice to warm up the system.

Here’s a better test

16:16:32 Temp$ ruby bm.rb; ruby19 bm.rb
1.8.7
user system total
real
size = 10000 10000 0.110000 0.000000 0.110000 (
0.118000)
size = 100000 100000 0.781000 0.000000 0.781000 (
0.769000)
size = 1000000 1000000 25.172000 0.000000 25.172000 (
25.185000)
size = 10000 10000 0.328000 0.000000 0.328000 (
0.329000)
size = 100000 100000 0.547000 0.000000 0.547000 (
0.542000)
size = 1000000 1000000 18.156000 0.000000 18.156000 (
18.151000)
1.9.1
user system total
real
size = 10000 10000 0.031000 0.000000 0.031000 (
0.027000)
size = 100000 100000 0.031000 0.000000 0.031000 (
0.028000)
size = 1000000 1000000 0.031000 0.000000 0.031000 (
0.043000)
size = 10000 10000 0.031000 0.000000 0.031000 (
0.031000)
size = 100000 100000 0.032000 0.000000 0.032000 (
0.023000)
size = 1000000 1000000 0.031000 0.000000 0.031000 (
0.043000)

“better” in the sense that we’re not suffering from bmbm’s magic.

Kind regards

robert


#4

Jim M. wrote:

Jim

This variation exhibits the same behavior.

require ‘benchmark’

class BufferContainer

def initialize( initial_data )
@buf = initial_data
end

def put_array( array )
@buf[0, array.size] = array
end
end

size = 999
the_array = (0…4).to_a

(1…3).each{|i|
size *= 10
a = [nil] * size

RUN WITH THIS COMMENTED OUT FIRST

a << 9

puts “size = #{ size } #{ a.size }”
puts Benchmark.measure {
10_000.times {
buf = BufferContainer.new( a )
buf.put_array( the_array )
}
}

}


#5

On Feb 27, 2009, at 05:16 , Jim M. wrote:

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

I don’t have time to dig into this, but I believe he’s hitting a GC
threshold by going over N objects.


#6

On Feb 28, 2009, at 21:32 , Ryan D. wrote:

threshold by going over N objects.
Poke around gc.c and see what you can figure out. Also, look at the
1.9 version or through the changelog to find hints…

#ifndef GC_MALLOC_LIMIT
#if defined(MSDOS) || defined(human68k)
#define GC_MALLOC_LIMIT 200000
#else
#define GC_MALLOC_LIMIT 8000000
#endif
#endif


#7

Jim M. wrote:

Jim

I am betting Ryan is right about GC issues. JRuby does not fall down
doing this either (which probably also backs up the GC theory):

ruby ~/jruby/scripts/arr_ben.rb
size = 10000 10000
0.050000 0.000000 0.050000 ( 0.057190)
size = 100000 100000
0.660000 0.000000 0.660000 ( 0.672178)
size = 1000000 1000000
34.430000 0.340000 34.770000 ( 35.562454)

jruby --server ~/jruby/scripts/arr_ben.rb
size = 10000 10000
0.372000 0.000000 0.372000 ( 0.266000)
size = 100000 100000
0.099000 0.000000 0.099000 ( 0.099000)
size = 1000000 1000000
0.154000 0.000000 0.154000 ( 0.154000)

-Tom