On Sun, Oct 07, 2007 at 09:58:02AM +0900, M. Edward (Ed) Borasky wrote:
M. Edward (Ed) Borasky wrote:
OK … here’s the analysis. The attached PDF is what’s known as a box and whisker plot, usually shortened to “boxplot”. The raw numbers that went into this are from the Alioth shootout page, and what I haven’t done is check which versions of Perl, Python, YARV, Ruby, jRuby and PHP these tests used. They could be years old or they could have been run yesterday morning. I discarded all the tests for which there were any missing values.
Nice. I wasn’t expecting anything as effective for a quick, seat-of-the-pants analysis as a box plot. What did you use to develop the graphic? I’m curious …
The value plotted is the ratio of seconds for the benchmark on the
dynamic language to the seconds for the benchmark with gcc. Thus, gcc
equals 1.0 across the board and lower is better/faster. How do you
interpret these plots?
Correct me if I’m wrong, but … won’t doing this as a gcc-comparative ratio bias slower languages toward an exaggerated upper bound on the box?
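A minimal sketch with invented timings (not the Alioth data) illustrates the worry: if two languages vary by the same *factor* across benchmarks, the slower one’s gcc-relative ratios span a much wider absolute range, which stretches its box on a linear axis.

```python
# Invented timings in seconds -- purely illustrative, not the shootout data.
gcc_times  = [1.0, 2.0, 4.0]
slow_times = [30.0, 80.0, 200.0]   # a slower dynamic language
fast_times = [3.0, 8.0, 20.0]      # a faster one

# Ratio to gcc, as in the boxplots: lower is better, gcc == 1.0.
slow_ratios = [s / g for s, g in zip(slow_times, gcc_times)]  # [30, 40, 50]
fast_ratios = [f / g for f, g in zip(fast_times, gcc_times)]  # [3, 4, 5]

# Both vary by the same factor (worst/best = 5/3 in each case), but the
# absolute spread of the ratios is ten times larger for the slower
# language, so its box looks much taller on a linear scale.
slow_spread = max(slow_ratios) - min(slow_ratios)  # 20.0
fast_spread = max(fast_ratios) - min(fast_ratios)  # 2.0
print(slow_spread, fast_spread)
```

A log-scale axis (or plotting log-ratios) would remove that visual bias.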
I admit I haven’t done this sort of thing in a while, so my sorta
heuristic analysis of what I see may be prone to minor glitches.
The whisker on the bottom is approximately the 5th percentile. In other
words, only five percent of the time is your performance going to be
that good or better. And the whisker on the top is approximately the
95th percentile – only five percent of the time is it going to be that
bad or worse.
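For what it’s worth, 5th/95th-percentile whiskers like the ones described above can be computed directly from the raw ratios. A minimal sketch with made-up numbers (note that many boxplot tools instead extend whiskers to 1.5 times the interquartile range, so the convention varies by plotting package):

```python
import statistics

# Made-up benchmark ratios for one language -- not the Alioth data.
ratios = [1.5, 2.0, 3.1, 4.4, 6.3, 8.0, 12.5, 20.0, 34.0, 55.0, 90.0, 170.0]

# statistics.quantiles with n=20 yields 19 cut points at 5% steps,
# so the first and last are the 5th and 95th percentiles.
cuts = statistics.quantiles(ratios, n=20, method='inclusive')
p5, p95 = cuts[0], cuts[-1]
print(f"5th percentile ~= {p5:.2f}, 95th percentile ~= {p95:.2f}")
```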
Both yarv and ruby appear to be prone to low-end statistical outliers, or perhaps to a “long tail” on the slow side of things.
is faster. Ruby and jRuby are not in the same neighborhood.
Actually, the ruby and php medians (using lower-case to denote implementations) are almost identical, and perl’s isn’t far off, according to this. It appears that only in instances approaching worst-case does ruby really begin to suffer. Also, if I’m not missing the implications of a gcc-ratio comparison, I’m not entirely sure we can trust the huge visual differences in the height of the boxes to indicate significant performance problem areas.
I’m also not sure the jruby implementation’s numbers here are representative of JRuby’s strengths. Does JRuby get the benefit of Java’s optimizing VM for long-running processes? I know it’s probably suffering, in micro-benchmarks, from increased start-up times as it loads the VM.
I’m surprised to see python’s median so low. Are these benchmarks heavy on bytecode-optimized Python, or do they tend to use standard interpreted Python?
yarv 13.2
perl 14.0
python 14.2
php 15.5
ruby 29.9
jruby 55.0
That’s a little clearer (barring the problem of whether jruby benefits from the optimizations for long-running processes), but of course suffers from the lack of attention to statistical outliers and low-end performance curves. That’s one of the reasons I like box plots.
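A tiny example of how a single summary number can hide exactly what a box plot shows -- two invented sets of ratios with the same mean but very different tails:

```python
import statistics

# Invented ratios, chosen so both sets have the same mean.
steady = [12, 13, 14, 15, 16]   # tight cluster, predictable
spiky  = [5, 6, 7, 8, 44]       # mostly fast, one terrible outlier

assert statistics.mean(steady) == statistics.mean(spiky) == 14

# Identical means, yet the medians and worst cases differ wildly --
# a boxplot makes the outlier visible, a single average does not.
print(statistics.median(steady), statistics.median(spiky))  # 14 7
print(max(steady), max(spiky))                              # 16 44
```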
code – no C libraries for the intensive calculations, no highly-tuned
web servers, databases or other components – you’re going to end up
throwing twice as much hardware at scaling problems in Ruby as you will
in PHP, Perl or Python. The good news is that YARV will level the
playing field.
Unless you’re implying a different set of working conditions, you’re not going to throw twice as much hardware at the Ruby implementation, because the bottlenecks for dynamic language web development (assuming good design) still haven’t changed. They’re still I/O and network traffic of various sorts, not the languages.
In my experience, that’s pretty much the sort of thing that happens with
everything that is lax enough on performance needs to “settle” for
something like Perl, et al. That’s the neighborhood I’m talking about
for performance: not so slow it can’t be used for a common desktop app,
not so fast that it should be used for hard-core number-crunching or
graphics-intensive game development.
Neighborhoods are relative, after all. There’s the neighborhood with C/C++, OCaml, and so on, way up at the top; there’s the neighborhood with Perl, Python, Ruby, and so on, somewhere in the middle; there’s the neighborhood with the languages so slow nobody really uses them, way down at the bottom. You’re more likely to find you need to make finer distinctions in execution speed up near the top, where performance really matters.
One final note to implementers and language designers … Python gets an
extra little pat on the back from me for having such a low spread. YARV
and other Ruby implementations need to pay attention to the fact that
their boxes are wider than Perl’s and PHP’s and a lot wider than
Python’s. In other words, look at the benchmarks where you really suck
first.
No kidding.
Those of us picking a language for a project might want to check the
specifics of where a language really sucks, too, to determine whether
that’s going to be a problem – but other than that, as long as your
performance needs are simple enough to allow for a dynamic language like
Perl et alii, you might as well just pick the one you like the best. At
least, that’s my take on the matter – look for problem areas that are
deal-breakers, and otherwise don’t worry about it too much unless you’re
writing code that has to fit in eight bits and run like the dickens (for
example). Anything else strikes me as a case of “premature optimization”, especially considering the difference a good algorithm can make.
In case you want the numbers that go with the boxplots, here they are:
        gcc   yarv  python   perl    php   ruby  jruby
Low       1    1.2     1.4   0.93    1.4    1.5    3.4
Q1        1    4.8     4.6   2.90    3.1    6.3   12.0
Median    1    8.7    14.0  26.00   31.0   34.0   50.0
Q3        1   68.0    45.0  55.00   55.0  170.0  340.0
High      1  150.0    98.0  67.00  110.0  380.0  410.0
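Since the box height in the plots is just Q3 minus Q1, the spreads discussed above can be recovered directly from that table. A quick sketch in plain Python, with the numbers transcribed as posted:

```python
# Five-number summaries (Low, Q1, Median, Q3, High) from the table above.
summary = {
    "gcc":    (1.0,   1.0,   1.0,   1.0,   1.0),
    "yarv":   (1.2,   4.8,   8.7,  68.0, 150.0),
    "python": (1.4,   4.6,  14.0,  45.0,  98.0),
    "perl":   (0.93,  2.9,  26.0,  55.0,  67.0),
    "php":    (1.4,   3.1,  31.0,  55.0, 110.0),
    "ruby":   (1.5,   6.3,  34.0, 170.0, 380.0),
    "jruby":  (3.4,  12.0,  50.0, 340.0, 410.0),
}

# Interquartile range == box height; bigger means less predictable
# performance across benchmarks. python's box is indeed the narrowest
# of the dynamic languages, ruby's and jruby's by far the widest.
for lang, (low, q1, med, q3, high) in summary.items():
    print(f"{lang:7s} IQR = {q3 - q1:6.1f}  median = {med}")
```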
Thanks muchly for the statistics-wrangling. It’s instructive.