On Sun, Oct 07, 2007 at 09:58:02AM +0900, M. Edward (Ed) Borasky wrote:
M. Edward (Ed) Borasky wrote:
OK … here’s the analysis. The attached PDF is what’s known as a box and whisker plot, usually shortened to “boxplot”. The raw numbers that went into this are from the Alioth shootout page, and what I haven’t done is check which versions of Perl, Python, YARV, Ruby, jRuby and PHP these tests used. They could be years old or they could have been run yesterday morning. I discarded all the tests for which there were any missing values.
Nice. I wasn’t expecting anything as effective for a quick, seat-of-the-pants analysis as a box plot. What did you use to develop the graphic? I’m curious …
The value plotted is the ratio of seconds for the benchmark on the
dynamic language to the seconds for the benchmark with gcc. Thus, gcc
equals 1.0 across the board and lower is better/faster. How do you
interpret these plots?
Correct me if I’m wrong, but … won’t doing this as a gcc-comparative ratio bias slower languages toward an exaggerated upper bound on the box?
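A minimal sketch with invented timings (not the Alioth data) illustrates the worry: if two languages vary by the same *factor* across benchmarks, the slower one’s gcc-relative ratios span a much wider absolute range, which stretches its box on a linear axis.

```python
# Invented timings in seconds -- purely illustrative, not the shootout data.
gcc_times  = [1.0, 2.0, 4.0]
slow_times = [30.0, 80.0, 200.0]   # a slower dynamic language
fast_times = [3.0, 8.0, 20.0]      # a faster one

# Ratio to gcc, as in the boxplots: lower is better, gcc == 1.0.
slow_ratios = [s / g for s, g in zip(slow_times, gcc_times)]  # [30, 40, 50]
fast_ratios = [f / g for f, g in zip(fast_times, gcc_times)]  # [3, 4, 5]

# Both vary by the same factor (worst/best = 5/3 in each case), but the
# absolute spread of the ratios is ten times larger for the slower
# language, so its box looks much taller on a linear scale.
slow_spread = max(slow_ratios) - min(slow_ratios)  # 20.0
fast_spread = max(fast_ratios) - min(fast_ratios)  # 2.0
print(slow_spread, fast_spread)
```

A log-scale axis (or plotting log-ratios) would remove that visual bias.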
I admit I haven’t done this sort of thing in a while, so my sorta
heuristic analysis of what I see may be prone to minor glitches.
The whisker on the bottom is approximately the 5th percentile. In other
words, only five percent of the time is your performance going to be
that good or better. And the whisker on the top is approximately the
95th percentile – only five percent of the time is it going to be that
bad or worse.
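For what it’s worth, 5th/95th-percentile whiskers like the ones described above can be computed directly from the raw ratios. A minimal sketch with made-up numbers (note that many boxplot tools instead extend whiskers to 1.5 times the interquartile range, so the convention varies by plotting package):

```python
import statistics

# Made-up benchmark ratios for one language -- not the Alioth data.
ratios = [1.5, 2.0, 3.1, 4.4, 6.3, 8.0, 12.5, 20.0, 34.0, 55.0, 90.0, 170.0]

# statistics.quantiles with n=20 yields 19 cut points at 5% steps,
# so the first and last are the 5th and 95th percentiles.
cuts = statistics.quantiles(ratios, n=20, method='inclusive')
p5, p95 = cuts[0], cuts[-1]
print(f"5th percentile ~= {p5:.2f}, 95th percentile ~= {p95:.2f}")
```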
Both yarv and ruby appear to be prone to low-end statistical outliers, or perhaps to a “long tail” on the slow side of things.
is faster. Ruby and jRuby are not in the same neighborhood.
Actually, the ruby and php medians (using lower-case to denote implementations) are almost identical, and perl’s isn’t far off, according to this. It appears that only in instances approaching worst-case does ruby really begin to suffer. Also, if I’m not missing the implications of a gcc-ratio comparison, I’m not entirely sure we can trust the huge visual differences in the height of the boxes to indicate significant performance problem areas.
I’m also not sure the jruby implementation’s numbers here are representative of JRuby’s strengths. Does JRuby get the benefit of Java’s optimizing VM for long-running processes? I know it’s probably suffering, in micro-benchmarks, from increased start-up times as it loads the VM.
I’m surprised to see python’s median so low. Are these benchmarks heavy on bytecode-optimized Python, or do they tend to use standard interpreted Python?
yarv 13.2
perl 14.0
python 14.2
php 15.5
ruby 29.9
jruby 55.0
That’s a little clearer (barring the problem of whether jruby benefits from the optimizations for long-running processes), but of course suffers from the lack of attention to statistical outliers and low-end performance curves. That’s one of the reasons I like box plots.
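A tiny example of how a single summary number can hide exactly what a box plot shows -- two invented sets of ratios with the same mean but very different tails:

```python
import statistics

# Invented ratios, chosen so both sets have the same mean.
steady = [12, 13, 14, 15, 16]   # tight cluster, predictable
spiky  = [5, 6, 7, 8, 44]       # mostly fast, one terrible outlier

assert statistics.mean(steady) == statistics.mean(spiky) == 14

# Identical means, yet the medians and worst cases differ wildly --
# a boxplot makes the outlier visible, a single average does not.
print(statistics.median(steady), statistics.median(spiky))  # 14 7
print(max(steady), max(spiky))                              # 16 44
```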
code – no C libraries for the intensive calculations, no highly-tuned
web servers, databases or other components – you’re going to end up
throwing twice as much hardware at scaling problems in Ruby as you will
in PHP, Perl or Python. The good news is that YARV will level the
playing field.
Unless you’re implying a different set of working conditions, you’re not going to throw twice as much hardware at the Ruby implementation, because the bottlenecks for dynamic language web development (assuming good design) still haven’t changed. They’re still I/O and network traffic of various sorts, not the languages.
In my experience, that’s pretty much the sort of thing that happens with
everything that is lax enough on performance needs to “settle” for
something like Perl, et al. That’s the neighborhood I’m talking about
for performance: not so slow it can’t be used for a common desktop app,
not so fast that it should be used for hard-core number-crunching or
graphics-intensive game development.
Neighborhoods are relative, after all. There’s the neighborhood with C/C++, OCaml, and so on, way up at the top; there’s the neighborhood with Perl, Python, Ruby, and so on, somewhere in the middle; there’s the neighborhood with the languages so slow nobody really uses them, way down at the bottom. You’re more likely to find you need to make finer distinctions in execution speed up near the top, where performance really matters.
One final note to implementers and language designers … Python gets an
extra little pat on the back from me for having such a low spread. YARV
and other Ruby implementations need to pay attention to the fact that
their boxes are wider than Perl’s and PHP’s and a lot wider than
Python’s. In other words, look at the benchmarks where you really suck
first.
No kidding.
Those of us picking a language for a project might want to check the
specifics of where a language really sucks, too, to determine whether
that’s going to be a problem – but other than that, as long as your
performance needs are simple enough to allow for a dynamic language like
Perl et alii, you might as well just pick the one you like the best. At
least, that’s my take on the matter – look for problem areas that are
deal-breakers, and otherwise don’t worry about it too much unless you’re
writing code that has to fit in eight bits and run like the dickens (for
example). Anything else strikes me as a case of “premature optimization”, especially considering the difference a good algorithm can make.
In case you want the numbers that go with the boxplots, here they are:
        gcc   yarv  python   perl    php   ruby  jruby
Low       1    1.2     1.4   0.93    1.4    1.5    3.4
Q1        1    4.8     4.6   2.90    3.1    6.3   12.0
Median    1    8.7    14.0  26.00   31.0   34.0   50.0
Q3        1   68.0    45.0  55.00   55.0  170.0  340.0
High      1  150.0    98.0  67.00  110.0  380.0  410.0
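Since the box height in the plots is just Q3 minus Q1, the spreads discussed above can be recovered directly from that table. A quick sketch in plain Python, with the numbers transcribed as posted:

```python
# Five-number summaries (Low, Q1, Median, Q3, High) from the table above.
summary = {
    "gcc":    (1.0,   1.0,   1.0,   1.0,   1.0),
    "yarv":   (1.2,   4.8,   8.7,  68.0, 150.0),
    "python": (1.4,   4.6,  14.0,  45.0,  98.0),
    "perl":   (0.93,  2.9,  26.0,  55.0,  67.0),
    "php":    (1.4,   3.1,  31.0,  55.0, 110.0),
    "ruby":   (1.5,   6.3,  34.0, 170.0, 380.0),
    "jruby":  (3.4,  12.0,  50.0, 340.0, 410.0),
}

# Interquartile range == box height; bigger means less predictable
# performance across benchmarks. python's box is indeed the narrowest
# of the dynamic languages, ruby's and jruby's by far the widest.
for lang, (low, q1, med, q3, high) in summary.items():
    print(f"{lang:7s} IQR = {q3 - q1:6.1f}  median = {med}")
```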
Thanks muchly for the statistics-wrangling. It’s instructive.