Regexp-engine: ruby vs. perl

Hello list,

I’ve got a question about ruby’s regexp-engine: I’m wondering why ruby’s
regexp-engine is so much slower than perl’s.

My test file looks like this (status.dat from nagios):

{
key=value
… ~20 further key=value pairs
}

This file’s size is about 100MB.

[perl – v5.8.8]
time perl -wnl -00 -e 'print if /host_name=monslave\d+/ and
/service_description=load/ and /servicestatus\s+{[^}]+}/m' \
/tmp/status.dat >/dev/null
perl -wnl -00 -e /tmp/status.dat > /dev/null 0.90s user 0.11s system
51% cpu 1.946 total

[ruby19 – ruby 1.9.1p129 (2009-05-12 revision 23412) [i686-linux]]
time ruby19 -wnl -00 -e 'print if /host_name=monslave\d+/ and
/service_description=load/ and /servicestatus\s+{[^}]+}/m' \
/tmp/status.dat >/dev/null
ruby19 -wnl -00 -e /tmp/status.dat > /dev/null 5.13s user 0.15s system
50% cpu 10.449 total

[ruby18 – ruby 1.8.7p5000 (2009-02-19) [i686-linux]]
time ruby18 -wnl -00 -e 'print if /host_name=monslave\d+/ and
/service_description=load/ and /servicestatus\s+{[^}]+}/m' \
/tmp/status.dat >/dev/null
ruby18 -wnl -00 -e /tmp/status.dat > /dev/null 3.93s user 0.05s system
48% cpu 8.153 total

So, both versions of ruby are slower than perl and I’m wondering why.

I’d like to integrate ruby into my daily work (it’s actually a
wonderful/beautiful language), but it’s hard to justify when something
like the trivial regexp above is about a factor of 4-5 slower than in perl.
Writing and using regexps is part of my daily work.

Thanks


Kind regards

Axel S.
Platform Engineer

On 7/6/09, Axel S. wrote:

I’ve got a question about ruby’s regexp-engine: I’m wondering why ruby’s
regexp-engine is so much slower than perl’s.

Just guessing here, but usually when regexes are slow it’s because of
backtracking. Since it looks like you don’t need any backtracking in
this little script, you might try wrapping your repetitions in atomic
groups, (?> ), which stop the engine from backtracking into them. (And
yes, perl doesn’t require this hack to be fast. Perl’s probably applying
it for you automatically… perl’s regex is smarter than ruby’s; what can
I say?) HTH
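
To make the (?> ) suggestion concrete, here is a minimal sketch of what it could look like in ruby. The sample record, the wanted? helper and the constant names are made up for illustration (the real status.dat has more fields), and the braces are escaped for clarity. It is untested against the real 100MB file, so it may well make no measurable difference there; the timing loop is only a rough way to check.

```ruby
require 'benchmark'

# Hypothetical record shaped like a nagios status.dat block; only the two
# keys used by the patterns come from the original post.
RECORD = <<~REC
  servicestatus {
  host_name=monslave01
  service_description=load
  current_state=0
  }
REC

# Pattern from the question, with the literal braces escaped.
PLAIN  = /servicestatus\s+\{[^}]+\}/m
# Same pattern with each repetition wrapped in an atomic group (?>...),
# so the engine cannot backtrack into \s+ or [^}]+ once they have matched.
ATOMIC = /servicestatus(?>\s+)\{(?>[^}]+)\}/m

# True if the record passes all three tests from the one-liner.
def wanted?(record, block_re)
  !!(record =~ /host_name=monslave\d+/ &&
     record =~ /service_description=load/ &&
     record =~ block_re)
end

# Rough timing over repeated copies of the sample record.
records = Array.new(20_000) { RECORD }
[["plain", PLAIN], ["atomic", ATOMIC]].each do |label, re|
  secs = Benchmark.realtime { records.each { |r| wanted?(r, re) } }
  printf("%-6s %.3fs\n", label, secs)
end
```

Both patterns accept the same records here; the atomic version only changes what the engine is allowed to retry when a match fails partway through.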