Dear list,
a few days ago I wrote a script that stored values in a Hashtable with
different buckets. It was some string-parsing grunt work.
I noticed that Ruby became pretty slow once the hashtable grew to about
10 million entries. To make sure that Ruby was actually to blame, I
wrote comparison snippets in Perl and Python, and yes, Ruby was just
plain slower.
By reducing the amount of code step by step I was able to narrow the
problem down. Currently, I can reproduce the behaviour with the
following simple scriptlet:
---%<---
tbl = { }
10_000_000.times do |i|
  tbl['last'] = i*i*i
end
--->%---
Execution times are as follows:

- ruby loop0r.rb
  real 0m10.741s
  user 0m10.596s
  sys  0m0.107s

- perl loop0r.pl
  real 0m4.503s
  user 0m4.420s
  sys  0m0.048s

- python loop0r.py
  real 0m6.704s
  user 0m6.367s
  sys  0m0.274s

Replacing the String I use as Hash key with a Symbol, i.e., :last,
lowers Ruby's execution time dramatically, so that it becomes faster
than Python (and still slower than Perl, but that's ok).
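
For anyone who wants to repeat the comparison without three separate
scripts, here is a rough sketch using Ruby's standard Benchmark module.
The method names and the ITERATIONS constant are made up for this
example, the iteration count is smaller than in my original run, and
the stored value is simplified to the loop index, so the absolute
numbers will not match the timings above.

```ruby
require 'benchmark'

# Assumption: fewer iterations than the original 10_000_000, to keep the run short.
ITERATIONS = 1_000_000

# Key is a String literal that is evaluated on every iteration.
def fill_with_literal(n)
  tbl = {}
  n.times { |i| tbl['last'] = i }
  tbl
end

# Key is a Symbol.
def fill_with_symbol(n)
  tbl = {}
  n.times { |i| tbl[:last] = i }
  tbl
end

# Key is a single String object, frozen once before the loop.
def fill_with_frozen(n)
  key = 'last'.dup.freeze
  tbl = {}
  n.times { |i| tbl[key] = i }
  tbl
end

Benchmark.bm(16) do |x|
  x.report('String literal:') { fill_with_literal(ITERATIONS) }
  x.report('Symbol:')         { fill_with_symbol(ITERATIONS) }
  x.report('frozen String:')  { fill_with_frozen(ITERATIONS) }
end
```

All three variants end up with a one-entry table, so any difference in
the reported times comes from how the key is handled, not from the size
of the table.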
I was baffled and ran callgrind against the interpreter. It came up
with this:
---%<---
183,248,191 /usr/local/rvm/src/ruby-2.0.0-p247/vm.inc:vm_exec_core'2
155,033,405 /usr/local/rvm/src/ruby-2.0.0-p247/siphash.c:ruby_sip_hash24
    [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
112,110,714 /usr/local/rvm/src/ruby-2.0.0-p247/insns.def:vm_exec_core'2
93,170,146 /usr/local/rvm/src/ruby-2.0.0-p247/string.c:str_replace
    [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
78,086,663 /usr/local/rvm/src/ruby-2.0.0-p247/st.c:st_update
    [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
74,991,856 /usr/local/rvm/src/ruby-2.0.0-p247/gc.c:slot_sweep
    [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
73,137,809 /usr/local/rvm/src/ruby-2.0.0-p247/vm_insnhelper.c:vm_call_cfunc_with_frame'2
    [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
63,043,562 /usr/local/rvm/src/ruby-2.0.0-p247/vm_insnhelper.c:vm_push_frame
    [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
61,029,008 /usr/local/rvm/src/ruby-2.0.0-p247/vm.c:rb_yield
--->%---
Now: Why does Ruby call string.c:str_replace? There is obviously no
operation that modifies a string here. In fact, I even use the same
String as hash key over and over again.
Of course, I can also freeze the String, which amounts to the same as
using a Symbol (performance-wise). Still, I don't understand why Ruby
is calling str_replace.
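
One thing that can be checked from plain Ruby is whether Hash#[]=
stores the very String object it is given. Below is a minimal sketch;
the helper name stored_key_for is made up for illustration. Whether the
copy it reveals is what shows up as str_replace in the profile above is
only a guess on my part, which I have not verified against the
interpreter source.

```ruby
# Hypothetical helper: insert `key` into a fresh Hash and return the
# key object that the Hash actually holds afterwards.
def stored_key_for(key)
  tbl = {}
  tbl[key] = 1
  tbl.keys.first
end

mutable = 'last'.dup         # explicitly unfrozen String
frozen  = 'last'.dup.freeze  # same content, frozen before use

stored_mutable = stored_key_for(mutable)
stored_frozen  = stored_key_for(frozen)

# Is the stored key the same object as the one passed in?
puts "unfrozen key kept as-is: #{stored_mutable.equal?(mutable)}"
puts "frozen key kept as-is:   #{stored_frozen.equal?(frozen)}"
puts "stored copy is frozen:   #{stored_mutable.frozen?}"
```

The documentation for Hash#[]= says that an unfrozen String passed as a
key is duplicated and frozen, so the first line should report false and
the other two true. Comparing with equal? tests object identity rather
than content, which is why it can tell the copy apart from the
original.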
What is happening here? And how can I circumvent it?
Thanks a lot for any replies!
--- Eric