Robert D. wrote:
Links?
I am looking forward to this.
Cheers
Robert
Yeah, I’m with you. I actually took a look at the shootout page. First
of all, it isn’t as bad a site as some people make it out to be. Second,
they are running Debian and Gentoo, which means almost anyone could
duplicate their work (assuming the whole enchilada can be downloaded as
a tarball).
Analysis Phase (Trick 1):
- Collect the whole matrix of benchmarks. The rows will be benchmark
names, the columns will be languages, and the cells in the matrix will
be benchmark run times. Pick a language to be the “standard”. C is
probably the obvious choice, since it’s likely to be the most
“practical low-level language” (meaning not as many folks know Forth).
- Now compute the natural log of the ratio of each language’s time to
the standard’s time, for each benchmark. In some convenient statistics
package (a spreadsheet works fine, but I’d do it in R because the
kernel density estimators, boxplots, etc. are built in), compute the
histograms (or kernel density estimators, or boxplots, or all of the
above) of those log ratios for each language. That tells you how the
ratios are distributed. (There’s a small Ruby sketch of this step
after the example below.)
Example:
Raw times:

         Ruby   Perl   Python   PHP   C
Bench1   tr1    tp1    ty1      th1   tc1
Bench2   tr2    tp2    ty2      th2   tc2
Bench3   tr3    tp3    ty3      th3   tc3

Log ratios against the standard (C); the C column is ln(tc/tc) = 0:

         Ruby          Perl          Python        PHP           C
Bench1   ln(tr1/tc1)   ln(tp1/tc1)   ln(ty1/tc1)   ln(th1/tc1)   0
Bench2   ln(tr2/tc2)   ln(tp2/tc2)   ln(ty2/tc2)   ln(th2/tc2)   0
Bench3   ln(tr3/tc3)   ln(tp3/tc3)   ln(ty3/tc3)   ln(th3/tc3)   0
And then take the histograms of the columns (smaller is better).
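If it helps to see the arithmetic, here is a minimal Ruby sketch of the
log-ratio step. The benchmark names and timings are made up for
illustration; the real numbers would come off the shootout pages.

  # A rough sketch of Trick 1, assuming the timings have already been
  # collected into a hash of benchmark name => { language => seconds }.
  # All names and numbers here are hypothetical.
  TIMES = {
    "bench1" => { "ruby" => 4.2, "perl" => 3.1, "python" => 2.8, "php" => 3.5, "c" => 0.9 },
    "bench2" => { "ruby" => 9.7, "perl" => 6.4, "python" => 5.9, "php" => 7.2, "c" => 1.1 },
    "bench3" => { "ruby" => 1.8, "perl" => 2.0, "python" => 1.5, "php" => 2.3, "c" => 0.4 },
  }
  STANDARD = "c"

  # ln(time / standard time); the standard column comes out as all zeros.
  LOG_RATIOS = TIMES.map { |bench, times|
    [bench, times.map { |lang, t| [lang, Math.log(t / times[STANDARD])] }.to_h]
  }.to_h

  LOG_RATIOS.each do |bench, row|
    puts bench + ": " + row.map { |lang, r| format("%s=%+.2f", lang, r) }.join("  ")
  end

From there a spreadsheet, or hist()/density()/boxplot() in R, gives you
the per-language distributions.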
Tuning Phase (Trick 2):
Find the midpoints (medians) of the density curves, boxplots, or
histograms. These are the “typical” benchmarks. They are more
representative than the “outliers”. I saw one, for example, where Ruby
was over 100 times as fast as Perl. That’s not worth investing any
time in; it’s some kind of fluke: either something Perl is terrible
at, something Ruby is wonderful at, or simply a better implementation
in the Ruby code than in the Perl code.
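Continuing the hypothetical sketch from above: the median of each
language’s column is the midpoint, and the benchmark whose log ratio
sits closest to that median is a candidate “typical” benchmark worth
profiling.

  # Reuses LOG_RATIOS and STANDARD from the earlier sketch.
  def median(values)
    sorted = values.sort
    mid = sorted.length / 2
    sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  end

  languages = LOG_RATIOS.values.first.keys - [STANDARD]
  languages.each do |lang|
    ratios  = LOG_RATIOS.transform_values { |row| row[lang] }
    mid     = median(ratios.values)
    typical = ratios.min_by { |_bench, r| (r - mid).abs }.first
    puts format("%-6s median ln ratio %+.2f, typical benchmark: %s",
                lang, mid, typical)
  end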
Now you build a “profiling Ruby”, run the mid-range benchmarks with
profiling, and see where Ruby is spending its time. If you happen to
have a friend on the YARV team or the Cardinal team, have them run the
benchmarks too.
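Before going to the trouble of a gprof-enabled interpreter build, the
old pure-Ruby profiler at least shows which methods a benchmark leans
on. Just a sketch; the file and method names below are placeholders,
and on recent Rubies (where profile is gone from the stdlib) you’d
reach for something like the ruby-prof gem instead.

  # Run a mid-range benchmark under the stdlib profiler:
  #   ruby -rprofile typical_bench.rb    # typical_bench.rb is a placeholder
  # or pull it in from inside the script:
  require 'profile'   # stdlib in older Rubies; gone in newer ones

  def fib(n)          # stand-in for the real benchmark work
    n < 2 ? n : fib(n - 1) + fib(n - 2)
  end

  fib(20)             # a per-method time breakdown is printed at exit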
Some other tricks:
Once you know where Ruby is spending its time, play with compiler flags.
gcc has oodles of possible optimizations, and gcc itself was tuned by
processes like this. It’s worth spending a lot of time compiling the
Ruby interpreter, since it’s going to be run often.
Those are simple “low-hanging fruit” tricks … stuff you can do without
actually knowing what’s going on inside the Ruby interpreter. It will be
painfully obvious from the profiles, I think, where the opportunities
are.