Here’s a patch that makes Ruby interpretter slightly faster (~3%) by
optimizing a few commonly used functions (rb_call0, st_*, fix_equal):
And here’s a blog post with details;
Here’s a patch that makes Ruby interpretter slightly faster (~3%) by
optimizing a few commonly used functions (rb_call0, st_*, fix_equal):
And here’s a blog post with details;
I think you need to post that to ruby-lang, actually. Sounds very
interesting though.
On 2/21/07, Tomasz W. [email protected] wrote:
Here’s a patch that makes Ruby interpretter slightly faster (~3%) by
optimizing a few commonly used functions (rb_call0, st_*, fix_equal):And here’s a blog post with details;
Great article. The optimization of ‘switch’ is cute.
And the optimization of ‘fix_equal’ is awesome.
On 2/21/07, Tomasz W. [email protected] wrote:
Definitely a great read, I cannot judge the technical quality in depth
but for what I understand this is reasonable stuff.
I got a stupid question too, did you try the debugger after using -O3?
However you should post this to ruby-core if you hope for your patches
to be accepted.
It might not be worth to do so however in the advent of Ruby2 and the
small improvement.
But I am an ignorant so just try your luck there
Again this is good stuff even if it might not be adopted.
Cheers
Robert
On 21/02/07, Robert D. [email protected] wrote:
I got a stupid question too, did you try the debugger after using -O3?
I never use C debugger for single-stepping. If I want to know what’s
happening inside
a program I usually printf or strace or efence or ltrace or fenris or
objdump or
LD_PRELOAD or something like that.
The only use I have for C debugger is backtrace printing with segfaults.
It works fine with -O3, you just need to pass -g option instead of
-fomit-frame-pointer
(on some platforms -O* causes -fomit-frame-pointer, but not on x86).
In any case, there’s not much difference between -O and -O3 when it
comes
to debugging - any optimizations make single-stepping difficult.
On Feb 20, 11:32 pm, “Tomasz W.”
[email protected] wrote:
Here’s a patch that makes Ruby interpretter slightly faster (~3%) by
optimizing a few commonly used functions (rb_call0, st_*, fix_equal):
*http://zabor.org/taw/ruby-performance-patch-2007-02-21.patch.gzAnd here’s a blog post with details;
*taw's blog: Making Ruby faster–
Tomasz W. [http://t-a-w.blogspot.com/]
3% is a modest improvment. Let’s see what can be
achieved by switching to a language that’s designed
to be fast.
Ruby Lua LuaJIT
tarai 0% 408% 1918%
fib 0% 497% 2874%
mandelbrot 0% 920% 1347%
The source:
local function tarai( x, y, z )
if x <= y then
return y
else
return tarai( tarai(x-1, y, z),
tarai(y-1, z, x),
tarai(z-1, x, y))
end
end
tarai(12, 6, 0)
local function fib( n )
if n < 3 then
return 1
else
return fib(n-1) + fib(n-2)
end
end
local sqrt = math.sqrt
local insert = table.insert
local function c_mul( c1, c2 )
local a,b = unpack(c1)
local c,d = unpack(c2)
return { ac - bd, ad + cb }
end
local function c_abs( c )
local a, b = unpack( c )
return sqrt( aa + bb )
end
local function is_mandelbrot( z )
for i = 1, 100 do
z = c_mul( z, z )
if c_abs( z ) > 2 then
return false
end
end
return true
end
ary = {}
for xx = 0, 100 do
for yy = 0, 100 do
local x, y = xx / 50.0, yy / 50.0
local c = {x, y}
if is_mandelbrot( c ) then
insert( ary, c )
end
end
end
On Thu, 22 Feb 2007 03:25:10 +0100, William J. [email protected]
wrote:
3% is a modest improvment. Let’s see what can be
achieved by switching to a language that’s designed
to be fast.
-1, Troll.
David V.
Hi,
In message “Re: Making Ruby faster”
on Thu, 22 Feb 2007 11:25:10 +0900, “William J.”
[email protected] writes:
|3% is a modest improvment. Let’s see what can be
|achieved by switching to a language that’s designed
|to be fast.
|
Ruby Lua LuaJIT |
---|
tarai 0% 408% 1918% |
fib 0% 497% 2874% |
mandelbrot 0% 920% 1347% |
Here’s my numbers, meaningful or not:
Ruby YARV
def tarai( x, y, z )
if x <= y then
return y
else
return tarai( tarai(x-1, y, z),
tarai(y-1, z, x),
tarai(z-1, x, y))
end
end
p tarai(12, 6, 0)
def fib( n )
if n < 3 then
return 1
else
return fib(n-1) + fib(n-2)
end
end
p fib(12)
include Math
def c_mul( c1, c2 )
a,b = c1
c,d = c2
return ac - bd, ad + cb
end
def c_abs(c)
a, b = c
return sqrt( aa + bb )
end
def is_mandelbrot( z )
for i in 1…100 do
z = c_mul( z, z )
if c_abs( z ) > 2 then
return false
end
end
return true
end
ary = []
for xx in 0…100 do
for yy in 0…100 do
c = [xx / 50.0, yy / 50.0]
if is_mandelbrot( c ) then
ary.push(c)
end
end
end
On 21 feb, 02:32, “Tomasz W.”
[email protected] wrote:
Here’s a patch that makes Ruby interpretter slightly faster (~3%) by
optimizing a few commonly used functions (rb_call0, st_*, fix_equal):
*http://zabor.org/taw/ruby-performance-patch-2007-02-21.patch.gzAnd here’s a blog post with details;
*taw's blog: Making Ruby faster–
Tomasz W. [http://t-a-w.blogspot.com/]
Looks cool. Couple of comments…
if-then-else-switch… Not clear to me why it would be faster than
the switch. Also, most of that stuff is gone in ruby1.9. Seems you
were using ruby1.8, right? You should probably redo this patch
against yarv (see vm.c, for example).
Fixnum improvements. Kind of a good idea, but maybe a problematic
implementation. FIX2LONG() and similar are macros that allow hiding
the implementation details. This could allow in the future to have
Ruby treat Fixnums as true real objects (like Python), without
changing the source code. Currently, yes, they are pretty inefficient
as they do a shift each time, even when unneeded.
Removing the use of the macro may not be such a good idea, as
fix_equal like you have written it could break as soon as Fixnums are
no longer represented as longs, but as a class. You might work around
it by creating your own pseudo FIX2LONG_SANS_SHIFT() macro. That way,
if stuff changes, it will become obvious, and changing the macro to be
== to FIX2LONG() is all it should take.
Also, if I read your code correctly, fix_equal() as you presented it
is slightly different in that it does not check that the pointers are
the same when it is not a fixnum (this means something like a == a may
perform slower – albeit I’ll admit that is a weird case). I would
probably ask in the dev list why that check was needed in the first
place.
Judy is in general not faster than a hash lookup. It is just more
memory efficient.
Ruby might benefit from a better hash, thou, like google’s
dense_hash_map (albeit it is C++ code, which Matz is
against).
You should probably make a separate patch for each improvement you
made, instead of all of them together. That will make it easier to
merge (and evaluate) them in case one of them is rejected but others
accepted.
Other than that, pretty cool. You should keep working on it and join
the ruby dev list to start merging your changes.
Here’s also the docs for sending patches:
http://www.ruby-lang.org/en/community/ruby-core/
On 22/02/07, gga [email protected] wrote:
- Judy is in general not faster than a hash lookup. It is just more
memory efficient.
I cannot fully agree with that characterization, even if it’s correct
considering
only the operation count.
It’s not something you see on typical benchmarks [1], as they tend to
access all
memory with the same frequency, but real programs’ memory access
patterns
more or less follow the power law - very few data structures are
accessed all the time,
and gradually bigger groups of data structures are accessed less
frequently,
ending with most of the program memory being accessed very rarely.
For typical programs computers are pretty good at having the more
commonly
accessed things in faster caches. This means even modest decrease in
data structure sizes will pull frequently accessed data structures into
faster
caches. In most benchmarks all items are equally probably, so the items
pulled
into freed caches tends to be uninteresting.
I’d like to see Judy vs hash tables performance comparison on real
programs,
or at least on more realistic benchmarks.
One more thing - Ruby uses numeric hash tables for symbol lookup.
Symbols ids are highly non-random:
irb> ids = ObjectSpace.enum_for(:each_object,
Module).inject([]){|ms,md| ms + md.instance_methods}.uniq.map{|x|
x.to_sym.object_id}.sort; [ids.min, ids.max, ids.size]
=> [378, 269218, 1061]
That’s exactly the situation Judy is optimized for.
This forum is not affiliated to the Ruby language, Ruby on Rails framework, nor any Ruby applications discussed here.
Sponsor our Newsletter | Privacy Policy | Terms of Service | Remote Ruby Jobs