ruby in 50 milliseconds or less

Joel VanderWerf (Guest)
on 2009-07-18 06:57
(Received via mailing list)
If you use ruby 1.8 for quick command line tasks, and you use gems, you
may notice that the interpreter has an execution overhead that is small
but noticeable and irritating when repeated often enough.

$ time RUBYOPT='' ruby -e 1
ruby -e 1  0.01s user 0.00s system 105% cpu 0.011 total
$ time RUBYOPT='rubygems' ruby -e 1
RUBYOPT='rubygems' ruby -e 1  0.58s user 0.06s system 94% cpu 0.675
total

This is greatly improved in 1.9, which has gems built in.

$ time RUBYOPT='rubygems' ruby19 -e 1
RUBYOPT='rubygems' ruby19 -e 1  0.02s user 0.01s system 48% cpu 0.067
total

An order of magnitude improvement makes the delay much more acceptable,
but if you're working with 1.8, that's not an option.

So here's a hack for 1.8 that restores the speed of bare-metal ruby but
still lets you use gems. What it does is redefine Kernel#require to try
loading things without rubygems, but fall back to using rubygems when
there is a load failure.

Put the file in a dir on your $LOAD_PATH, and set RUBYOPT to reference
it, as shown below. *Note:* I haven't tested this widely yet. It may
break libraries that do their own hacking with require or rescue LoadError
for their own devious purposes. I advise against using this hack in
production code without careful testing.

$ cat gem-fallback.rb
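module Kernel
   # Keep a handle on the original, rubygems-free require.
   req = method :require
   define_method :require do |*args|
     begin
       req.call(*args)
     rescue LoadError
       # First failure: put the original require back so that rubygems
       # can wrap it as usual, then load rubygems and retry.
       Kernel.module_eval do
         define_method(:require, &req)
       end
       require 'rubygems'
       require(*args)
     end
   end
end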

$ time RUBYOPT='rgem-fallback' ruby -e 1
RUBYOPT='rgem-fallback' ruby -e 1  0.01s user 0.00s system 71% cpu 0.011
total

$ time RUBYOPT='rgem-fallback' ruby -e "require 'tagz'"
RUBYOPT='rgem-fallback' ruby -e "require 'tagz'"  0.60s user 0.07s
system 79% cpu 0.850 total
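
The try-fast-then-fall-back idea can also be tried out without touching Kernel#require at all. The following is only a sketch with stand-in loaders: FallbackLoader and the two lambdas are hypothetical names for illustration, not part of gem-fallback.rb.

```ruby
# Sketch only: models "use the cheap loader until the first LoadError,
# then switch to the expensive one for good". FallbackLoader is a
# hypothetical name, not part of the hack above.
class FallbackLoader
  attr_reader :fallback_used

  def initialize(fast, slow)
    @fast = fast
    @slow = slow
    @fallback_used = false
  end

  def load(name)
    return @slow.call(name) if @fallback_used
    begin
      @fast.call(name)
    rescue LoadError
      # First miss: switch to the slow loader permanently, then retry once.
      @fallback_used = true
      @slow.call(name)
    end
  end
end

# Stand-ins: the fast loader only knows 'set'; the slow one also knows 'tagz'.
fast = lambda { |name| name == 'set' ? true : raise(LoadError, "no such file to load -- #{name}") }
slow = lambda { |name| %w[set tagz].include?(name) ? true : raise(LoadError, "no such file to load -- #{name}") }

loader = FallbackLoader.new(fast, slow)
loader.load('set')    # served by the fast loader
loader.load('tagz')   # fast loader fails, slow loader takes over
```

As in the real hack, a library that neither loader can find still raises LoadError to the caller.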
Roger Pack (rogerdpack)
on 2009-07-18 21:09
Joel VanderWerf wrote:
> If you use ruby 1.8 for quick command line tasks, and you use gems, you
> may notice that the interpreter has an execution overhead that is small
> but noticeable and irritating when repeated often enough.

I've noticed this too.
My solution: a fake gem_prelude :)
Great minds think alike.
It would be interesting to time things tho.
http://github.com/rogerdpack/faster_rubygems/tree/master
Cheers!
=r
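
For anyone who wants to time things, a rough sketch of a startup timer follows. It assumes Ruby 1.9 or later for the env-hash form of system and 2.1 or later for Process.clock_gettime, so it will not run on 1.8 itself; startup_time is a hypothetical helper name, and it times whichever interpreter runs the script.

```ruby
require 'rbconfig'

# Sketch: average wall-clock seconds for `ruby -e 1` under a given RUBYOPT.
# startup_time is a hypothetical helper; it launches the same interpreter
# that is running this script (RbConfig.ruby).
def startup_time(rubyopt, runs = 5)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { system({ 'RUBYOPT' => rubyopt }, RbConfig.ruby, '-e', '1') }
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) / runs
end

# Example: puts startup_time('')   # bare interpreter, no extra options
```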
Brian Candler (candlerb)
on 2009-07-18 21:22
Or simpler: only put "require 'rubygems'" at the top of scripts which
use rubygems.

(Obviously less convenient than using RUBYOPT of course, but your script
may be more portable)
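
A script that has to run on hosts with and without rubygems could wrap that idea in a small helper. This is a sketch under that assumption; require_portably is a hypothetical name, not an existing API.

```ruby
# Sketch: try a plain require first and load rubygems only if that fails.
# require_portably is a hypothetical helper name.
def require_portably(name)
  require name
rescue LoadError
  require 'rubygems'
  require name
end

require_portably 'set'   # stdlib: found without needing rubygems
```

If the library cannot be found even with rubygems loaded, the second require raises LoadError as usual.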
Joel VanderWerf (Guest)
on 2009-07-18 21:25
(Received via mailing list)
Roger Pack wrote:
> http://github.com/rogerdpack/faster_rubygems/tree/master

Looks nice, but it's solving a different problem, isn't it? It appears
that you're actually speeding up the gem loading process. My hack only
makes a difference if you're running a script that doesn't use gems at
all.

Put them together and it's a win in both cases!
Roger Pack (rogerdpack)
on 2009-07-18 21:27
> Looks nice, but it's solving a different problem, isn't it? It appears
> that you're actually speeding up the gem loading process. My hack only
> makes a difference if you're running a script that doesn't use gems at
> all.
>
> Put them together and it's a win in both cases!

It's genius! :)
=r
Joel VanderWerf (Guest)
on 2009-07-18 21:47
(Received via mailing list)
Brian Candler wrote:
> Or simpler: only put "require 'rubygems'" at the top of scripts which
> use rubygems.
>
> (Obviously less convenient than using RUBYOPT of course, but your script
> may be more portable)

Except I'd rather not have to guess/remember which things are installed
as gems, and do it correctly on each host I'm running the script on. So
the require hack figures that out for me. (We do some embedded work on
smartphones, gumstix, and geode, so some of our systems don't use gems
at all.)
Joel VanderWerf (Guest)
on 2009-07-18 21:53
(Received via mailing list)
Here's an extreme example where this makes a huge difference:

I have a dir tree with large numbers of small gps log files, in CSV
format, and I want to use ruby -a (autosplit) to work with them.

With RUBYOPT=rgem-fallback (or of course RUBYOPT=''):

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
RUBYOPT='' find . -type f -exec ruby -F, -ane '$F' {} \;  2.06s user
1.67s system 39% cpu 9.431 total

With RUBYOPT=rubygems:

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
find . -type f -exec ruby -F, -ane '$F' {} \;  219.02s user 61.52s
system 93% cpu 4:59.26 total

Of course, awk would probably be even faster, but ...
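
Another way to sidestep the per-file startup cost is to do the whole walk inside one interpreter. The following is a rough sketch of that alternative, not what was timed above; each_csv_row is a hypothetical helper name, and unlike -a without -l it chomps each line before splitting.

```ruby
require 'find'

# Sketch: walk a tree and split every line of every file on commas inside
# one interpreter, so the startup cost is paid once, not once per file.
# each_csv_row is a hypothetical helper name. Like -F, it splits naively
# and does not understand quoted CSV fields.
def each_csv_row(root)
  Find.find(root) do |path|
    next unless File.file?(path)
    File.foreach(path) do |line|
      yield path, line.chomp.split(',')
    end
  end
end

# Example: each_csv_row('.') { |path, fields| p fields }
```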
Joel VanderWerf (Guest)
on 2009-07-18 22:17
(Received via mailing list)
Roger Pack wrote:
> http://github.com/rogerdpack/faster_rubygems/tree/master

On Linux, faster_rubygems seems to have even more of an impact than on
the Windows installation you benchmarked. With about 250 gems installed:

$ time ruby examples/require_rubygems_normal.rb
done
ruby examples/require_rubygems_normal.rb  0.57s user 0.05s system 85%
cpu 0.726 total

$ time ruby examples/require_fast_start.rb
done
ruby examples/require_fast_start.rb  0.04s user 0.02s system 46% cpu
0.121 total

Very nice!

I had been thinking of something along similar lines: locating all gem lib
dirs. But instead of pushing them all onto $:, the idea was to set up a
single dir with symlinks to all the gem lib dirs. I expect it would be
faster because it would offload more of the path search to the
filesystem, rather than to ruby.
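
A rough sketch of how such a link directory might be built, untested against a real gem repository. It assumes a gems/NAME-VERSION/lib layout and a filesystem with symlink support; build_link_dir is a hypothetical name.

```ruby
require 'fileutils'

# Sketch: link every top-level entry of each gem's lib dir into one
# directory, so a single $: entry can stand in for hundreds.
# build_link_dir and the gems/*/lib layout are assumptions.
def build_link_dir(gems_root, link_dir)
  FileUtils.mkdir_p(link_dir)
  Dir.glob(File.join(gems_root, '*', 'lib', '*')).sort.each do |entry|
    target = File.join(link_dir, File.basename(entry))
    # First one wins on a name clash; real version resolution is skipped.
    next if File.exist?(target) || File.symlink?(target)
    FileUtils.ln_s(File.expand_path(entry), target)
  end
  link_dir
end

# Example: $:.unshift build_link_dir('/path/to/gems', '/path/to/links')
```

The sketch ignores two things a real version would need: choosing between multiple installed versions of a gem, and merging lib subdirectories that several gems share.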
Joel VanderWerf (Guest)
on 2009-09-12 22:08
(Received via mailing list)
Joel VanderWerf wrote:
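> So here's a hack for 1.8 that restores the speed of bare-metal ruby but
> still lets you use gems. What it does is redefine Kernel#require to try
> loading things without rubygems, but fall back to using rubygems when
> there is a load failure.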
>
> $ time RUBYOPT='rgem-fallback' ruby -e 1
> RUBYOPT='rgem-fallback' ruby -e 1  0.01s user 0.00s system 71% cpu 0.011
> total
>
> $ time RUBYOPT='rgem-fallback' ruby -e "require 'tagz'"
> RUBYOPT='rgem-fallback' ruby -e "require 'tagz'"  0.60s user 0.07s
> system 79% cpu 0.850 total

An update, in case anyone uses this: the sinatra gem uses some black
magic involving #caller, and the presence of this additional require
method on the call stack will confuse sinatra into thinking it is not in
"run" mode and it will not parse ARGV. You can fix this by setting a
constant when loading sinatra, as shown below. (To reiterate, I don't
recommend this for production code. This is mostly for fast startup when
using ruby from the command line. For production code, I am using the
crown tool that I announced a few weeks ago[1].)

module Kernel
   req = method :require
   define_method :require do |*args|
     begin
       req.call(*args)
     rescue LoadError
       Kernel.module_eval do
         define_method(:require, &req)
       end
       require 'rubygems'
       # Ask sinatra to skip this file's frames when it inspects #caller.
       if args.grep(/sinatra/).any?
         pat = /gem-fallback\.rb/
         if defined?(RUBY_IGNORE_CALLERS)
           RUBY_IGNORE_CALLERS << pat
         else
           RUBY_IGNORE_CALLERS = [pat]
         end
       end
       require(*args)
     end
   end
end

[1] http://github.com/vjoel/crown/tree/master
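
To see why an extra frame on the stack matters, here is a small sketch of caller filtering in general. It is an illustration only and not Sinatra's actual code; first_unignored_file and the sample stack lines are made up for the example.

```ruby
# Sketch: find the file of the first stack frame that does not match any
# ignore pattern. This illustrates caller filtering in general; it is not
# Sinatra's code, and first_unignored_file is a hypothetical name.
def first_unignored_file(stack, ignore_patterns)
  frame = stack.find { |line| ignore_patterns.none? { |pat| line =~ pat } }
  frame && frame[/\A(.+?):\d+/, 1]
end

# Made-up stack in the format that #caller returns.
stack = [
  "/usr/lib/ruby/site_ruby/gem-fallback.rb:11:in `require'",
  "app.rb:1"
]

first_unignored_file(stack, [])                     # => "/usr/lib/ruby/site_ruby/gem-fallback.rb"
first_unignored_file(stack, [/gem-fallback\.rb/])   # => "app.rb"
```

Without the ignore pattern, the wrapper file looks like the application; with it, the lookup skips to the script that actually issued the require.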
Robert Dober (Guest)
on 2009-09-13 13:27
(Received via mailing list)
On Sat, Jul 18, 2009 at 9:52 PM, Joel VanderWerf
<vjoel@path.berkeley.edu> wrote:
> system 39% cpu 9.431 total
>
> With RUBYOPT=rubygems:
>
> $ time find . -type f -exec ruby -F, -ane '$F' {} \;
> find . -type f -exec ruby -F, -ane '$F' {} \;  219.02s user 61.52s system
> 93% cpu 4:59.26 total
>
> Of course, awk would probably be even faster, but ...
... that would mean using the right tool for the right task ;)
Sorry couldn't resist. This however does not mean that your
contribution is not very valuable, because Ruby will be the right tool
often enough and even here, maybe you have a team where everybody
knows Ruby but few know awk....
Cheers
Robert
Joel VanderWerf (Guest)
on 2009-09-13 20:35
(Received via mailing list)
Robert Dober wrote:
> On Sat, Jul 18, 2009 at 9:52 PM, Joel VanderWerf
>> Of course, awk would probably be even faster, but ...
> ... that would mean using the right tool for the right task ;)

and where's the fun in that! ;)