Regexp backtracking issues in 1.8

Basically Ruby 1.8 fails to backtrack correctly a number of regular
expressions. The simplest one is probably:

irb(main):036:0> /a(ba|.)*?a/.match(‘axba’)
=> nil

There is more to it, and since I had to write it down for my students,
I’m just going to link to the gory details:
http://www.cs.umd.edu/~gaburici/rubyre.html

On 3/7/07, Vasile G. [email protected] wrote:


Posted via http://www.ruby-forum.com/.

I just followed the link to find this jewel:

Ruby 1.8 has a rather buggy regexp engine. Some constructs using
alternation followed by a Kleene star operator will not behave as
expected, or as they do in Python or Perl.

very funny I think

Cheers
Robert

Vasile G. schrieb:

Basically Ruby 1.8 fails to backtrack correctly a number of regular
expressions. The simplest one is probably:

irb(main):036:0> /a(ba|.)*?a/.match(‘axba’)
=> nil

There is more to it, and since I had to write it down for my students,
I’m just going to link to the gory details:
http://www.cs.umd.edu/~gaburici/rubyre.html

It is an Error in Ruby 1.8. In Ruby 1.9 (with Oniguruma) it works
correct
(“ruby-445” is a special naming of myself for some experiments):

Example >>>>>

C:\Dokumente und Einstellungen\wolfgang\Desktop>type regexerr.rb
p “axba”.match(/a(ba|.)*?a/)
C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby -v
ruby 1.8.5 (2006-08-25) [i386-mswin32]

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby regexerr.rb
nil

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby-445 regexerr.rb
#MatchData:0x2ac7ab8

C:\Dokumente und Einstellungen\wolfgang\Desktop>ruby-445 -v
ruby 1.9.0 (2006-10-28) [i386-mingw32]

EoE >>>>>

Wolfgang Nádasi-Donner

Vasile G. wrote:

Basically Ruby 1.8 fails to backtrack correctly a number of regular
expressions. The simplest one is probably:

irb(main):036:0> /a(ba|.)*?a/.match(‘axba’)
=> nil

There is more to it, and since I had to write it down for my students,
I’m just going to link to the gory details:
http://www.cs.umd.edu/~gaburici/rubyre.html

I’ve posted a lot of very short test cases with varying quantifiers. The
results are interesting:
http://nikolasco.livejournal.com/374793.html

Regards,
-Nikolas