Ruby in 50 milliseconds or less

Joel_VanderWerf · July 18, 2009, 6:57am

If you use ruby 1.8 for quick command line tasks, and you use gems, you
may notice that the interpreter has an execution overhead that is small
but noticeable and irritating when repeated often enough.

$ time RUBYOPT='' ruby -e 1
ruby -e 1 0.01s user 0.00s system 105% cpu 0.011 total
$ time RUBYOPT='rubygems' ruby -e 1
RUBYOPT='rubygems' ruby -e 1 0.58s user 0.06s system 94% cpu 0.675 total

This is greatly improved in 1.9, which has gems built in.

$ time RUBYOPT='rubygems' ruby19 -e 1
RUBYOPT='rubygems' ruby19 -e 1 0.02s user 0.01s system 48% cpu 0.067 total

An order of magnitude improvement makes the delay much more acceptable,
but if you’re working with 1.8, that’s not an option.
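
For anyone who wants to repeat this kind of measurement from inside Ruby instead of with the shell's time builtin, something like the following might do. It is only a sketch and assumes a current Ruby (system with an environment hash and RbConfig.ruby are not available on 1.8); startup_seconds and its parameters are names made up for this example.

```ruby
require "benchmark"
require "rbconfig"

# Average wall-clock seconds needed to launch the interpreter and run `-e 1`.
# extra_env is merged into the child's environment (e.g. a RUBYOPT value);
# runs is how many launches to average over. Both are illustrative parameters.
def startup_seconds(extra_env = {}, runs = 5)
  total = Benchmark.realtime do
    runs.times { system(extra_env, RbConfig.ruby, "-e", "1") }
  end
  total / runs
end

bare = startup_seconds({ "RUBYOPT" => "" })
printf("bare interpreter: %.3fs per launch\n", bare)
```

Averaging over a few launches smooths out disk-cache effects, which matter a lot at the 10 to 50 millisecond scale.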

So here’s a hack for 1.8 that restores the speed of bare-metal ruby but
still lets you use gems. What it does is redefine Kernel#require to try
loading things without rubygems, but fall back to using rubygems when
there is a load failure.

Put the file in a dir on your $LOAD_PATH, and set RUBYOPT to reference
it, as shown below. Note: I haven’t tested this widely yet. It may
break libraries that do their own hacking with require or rescue LoadError
for their own devious purposes. I advise not using this hack in
production code without careful testing.

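$ cat gem-fallback.rb
module Kernel
  # Keep a handle on the original require before replacing it.
  req = method :require
  define_method :require do |*args|
    begin
      # Fast path: plain require, with rubygems not loaded.
      req.call(*args)
    rescue LoadError
      # First failure: put the original require back, load rubygems
      # (which installs its own require), and retry.
      Kernel.module_eval do
        define_method(:require, &req)
      end
      require 'rubygems'
      require(*args)
    end
  end
end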

$ time RUBYOPT='rgem-fallback' ruby -e 1
RUBYOPT='rgem-fallback' ruby -e 1 0.01s user 0.00s system 71% cpu 0.011 total

$ time RUBYOPT='rgem-fallback' ruby -e "require 'tagz'"
RUBYOPT='rgem-fallback' ruby -e "require 'tagz'" 0.60s user 0.07s system 79% cpu 0.850 total
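
The same "try the cheap path, pay for the expensive setup only on the first failure, then get out of the way" pattern can be shown without touching require at all. The following is a toy sketch, not the hack itself: Lookup, REGISTRY, SETUP_CALLS and load_extras! are made-up stand-ins (the registry plays the role of $LOAD_PATH, load_extras! plays the role of require 'rubygems').

```ruby
# Toy stand-in for the require hack; every name here is hypothetical.
module Lookup
  REGISTRY = { "core" => :core_lib }
  SETUP_CALLS = []

  # The "expensive" step we want to avoid unless it is really needed.
  def self.load_extras!
    SETUP_CALLS << :extras
    REGISTRY["extra"] = :extra_lib
  end

  # Raises KeyError for unknown names, much as require raises LoadError.
  def self.find(name)
    REGISTRY.fetch(name)
  end
end

class << Lookup
  original = instance_method(:find)

  define_method(:find) do |name|
    begin
      original.bind(self).call(name)
    rescue KeyError
      # Restore the plain method first, so later calls skip this wrapper.
      Lookup.singleton_class.send(:define_method, :find, original)
      Lookup.load_extras!
      find(name)
    end
  end
end

Lookup.find("core")    # fast path, no setup performed
Lookup.find("extra")   # first miss triggers the one-time setup, then retries
```

Because the wrapper removes itself on the first miss, the cost of the extra rescue is paid at most once per process.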

Joel_VanderWerf · July 18, 2009, 9:22pm

Or simpler: only put “require 'rubygems'” at the top of scripts which
use rubygems.

(Less convenient than using RUBYOPT, of course, but your script
may be more portable.)

Joel_VanderWerf · July 18, 2009, 9:09pm

Joel VanderWerf wrote:

If you use ruby 1.8 for quick command line tasks, and you use gems, you
may notice that the interpreter has an execution overhead that is small
but noticeable and irritating when repeated often enough.

I’ve noticed this too.
My solution: a fake gem_prelude
Great minds think alike.
It would be interesting to time things though.
http://github.com/rogerdpack/faster_rubygems/tree/master
Cheers!
=r

Joel_VanderWerf · July 18, 2009, 9:25pm

Roger P. wrote:

http://github.com/rogerdpack/faster_rubygems/tree/master

Looks nice, but it’s solving a different problem, isn’t it? It appears
that you’re actually speeding up the gem loading process. My hack only
makes a difference if you’re running a script that doesn’t use gems at
all.

Put them together and it’s a win in both cases!

Joel_VanderWerf · July 18, 2009, 9:27pm

Looks nice, but it’s solving a different problem, isn’t it? It appears
that you’re actually speeding up the gem loading process. My hack only
makes a difference if you’re running a script that doesn’t use gems at
all.

Put them together and it’s a win in both cases!

It’s genius!
=r

Joel_VanderWerf · July 18, 2009, 9:53pm

Here’s an extreme example where this makes a huge difference:

I have a dir tree with large numbers of small gps log files, in CSV
format, and I want to use ruby -a (autosplit) to work with them.

With RUBYOPT=rgem-fallback (or of course RUBYOPT=''):

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
RUBYOPT='' find . -type f -exec ruby -F, -ane '$F' {} \; 2.06s user 1.67s system 39% cpu 9.431 total

With RUBYOPT=rubygems:

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
find . -type f -exec ruby -F, -ane '$F' {} \; 219.02s user 61.52s system 93% cpu 4:59.26 total

Of course, awk would probably be even faster, but …
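
For readers who have not used these flags: -n wraps the script in a loop over input lines, -a splits each line into $F, and -F, makes the comma the separator. A rough plain-Ruby equivalent might look like the sketch below. The method name autosplit and the sample coordinates are invented for illustration and are not taken from the log files mentioned above.

```ruby
require "stringio"

# Rough equivalent of `ruby -F, -ane '...'`: return the fields of every line.
# As with -a used without -l, the newline stays attached to the last field.
def autosplit(io, separator = ",")
  io.each_line.map { |line| line.split(separator) }
end

# Made-up sample data standing in for one small CSV log file.
sample = StringIO.new("37.87,-122.27,12.5\n37.88,-122.26,13.0\n")
rows = autosplit(sample)
p rows.first   # the fields of the first line
```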

Joel_VanderWerf · July 18, 2009, 9:47pm

Brian C. wrote:

Or simpler: only put “require 'rubygems'” at the top of scripts which
use rubygems.

(Less convenient than using RUBYOPT, of course, but your script
may be more portable.)

Except I’d rather not have to guess/remember which things are installed
as gems, and do it correctly on each host I’m running the script on. So
the require hack figures that out for me. (We do some embedded work on
smartphones, gumstix, and geode, so some of our systems don’t use gems
at all.)

Joel_VanderWerf · September 12, 2009, 10:08pm

An update, in case anyone uses this: the sinatra gem uses some black
magic involving #caller, and the presence of this additional require
method on the call stack confuses sinatra into thinking it is not in
“run” mode, so it does not parse ARGV. You can fix this by setting a
constant when loading sinatra, as shown below. (To reiterate, I don’t
recommend this for production code. It is mostly for fast startup when
using ruby from the command line. For production code, I am using the
crown tool that I announced a few weeks ago[1].)

module Kernel
  req = method :require
  define_method :require do |*args|
    begin
      req.call(*args)
    rescue LoadError
      Kernel.module_eval do
        define_method(:require, &req)
      end
      require 'rubygems'
      # Tell sinatra to skip this file when it inspects #caller.
      if args.grep(/sinatra/).any?
        pat = /gem-fallback\.rb/
        if defined?(RUBY_IGNORE_CALLERS)
          RUBY_IGNORE_CALLERS << pat
        else
          RUBY_IGNORE_CALLERS = [pat]
        end
      end
      require(*args)
    end
  end
end

[1] GitHub - vjoel/crown: Gather gem lib and bin files under one directory for fast loading and predictable behavior.
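
To make the #caller problem a little more concrete, here is a sketch of the kind of check involved. This is not sinatra's code, just an illustration of why an extra frame matters and how an ignore list helps; first_relevant_file and the stack frames below are hypothetical.

```ruby
# Return the file of the first stack frame that matches none of the
# ignore patterns. frames is an array of strings in Kernel#caller format.
def first_relevant_file(frames, ignore_patterns)
  files = frames.map { |frame| frame[/\A(.+?):\d+/, 1] }.compact
  files.find { |file| ignore_patterns.none? { |pat| file =~ pat } }
end

# Hypothetical stack, as a library loaded through the wrapper might see it.
frames = [
  "/opt/lib/gem-fallback.rb:5:in `require'",
  "app.rb:1:in `<main>'",
]

without_ignore = first_relevant_file(frames, [])
with_ignore    = first_relevant_file(frames, [/gem-fallback\.rb/])

puts without_ignore   # the wrapper file is mistaken for the application
puts with_ignore      # the real script is found once the wrapper is ignored
```

A library that compares the first relevant file against the script being run will get the wrong answer in the first case, which matches the symptom described above.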

Joel_VanderWerf · July 18, 2009, 10:17pm

Roger P. wrote:

http://github.com/rogerdpack/faster_rubygems/tree/master

On Linux, faster_rubygems seems to have even more of an impact than on
the Windows installation you benchmarked. With about 250 gems installed:

$ time ruby examples/require_rubygems_normal.rb
done
ruby examples/require_rubygems_normal.rb 0.57s user 0.05s system 85% cpu 0.726 total

$ time ruby examples/require_fast_start.rb
done
ruby examples/require_fast_start.rb 0.04s user 0.02s system 46% cpu 0.121 total

Very nice!

I had been thinking of something along similar lines: locating all gem lib
dirs. But instead of pushing them all onto $:, the idea was to set up a
single dir with symlinks to all the gem lib dirs. I expect it would be
faster because it would offload more of the path search to the
filesystem, rather than to ruby.
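
A minimal sketch of that symlink idea, under some loud assumptions: link_lib_entries is a made-up name, the first lib dir wins when two gems ship a file with the same name, and every gem is assumed to use lib as its only require path (not all do). It links the top-level entries of each lib dir into one directory, so that only that one directory has to be on $:.

```ruby
require "fileutils"

# Symlink every top-level entry of each lib dir into target_dir.
# Returns the links created; existing names are left alone (first dir wins).
def link_lib_entries(lib_dirs, target_dir)
  FileUtils.mkdir_p(target_dir)
  created = []
  lib_dirs.each do |lib_dir|
    Dir.children(lib_dir).sort.each do |entry|
      link = File.join(target_dir, entry)
      next if File.exist?(link) || File.symlink?(link)
      File.symlink(File.expand_path(entry, lib_dir), link)
      created << link
    end
  end
  created
end
```

The list of lib dirs could be gathered once, with rubygems loaded, by globbing "gems/*/lib" under each entry of Gem.path; scripts would then only need the single target directory on their load path.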

Joel_VanderWerf · September 13, 2009, 8:35pm

Robert D. wrote:

On Sat, Jul 18, 2009 at 9:52 PM, Joel VanderWerf

Of course, awk would probably be even faster, but …
… that would mean using the right tool for the right task

and where’s the fun in that!

Joel_VanderWerf · September 13, 2009, 1:27pm

On Sat, Jul 18, 2009 at 9:52 PM, Joel VanderWerf wrote:

system 39% cpu 9.431 total

With RUBYOPT=rubygems:

$ time find . -type f -exec ruby -F, -ane '$F' {} \;
find . -type f -exec ruby -F, -ane '$F' {} \; 219.02s user 61.52s system 93% cpu 4:59.26 total

Of course, awk would probably be even faster, but …
… that would mean using the right tool for the right task
Sorry, couldn’t resist. This does not mean that your contribution is
not valuable, however: Ruby will be the right tool often enough, and
even here, maybe you have a team where everybody knows Ruby but few
know awk…
Cheers
Robert