Ruby Regexp implementation?

This probably doesn’t belong in Ruby Core. I was reading this article:
http://swtch.com/~rsc/regexp/regexp1.html
http://swtch.com/~rsc/regexp/regexp2.html
http://swtch.com/~rsc/regexp/regexp3.html

And in the first few lines of the first page, it shows that the Regexp
implementation for Perl is absolutely terrible, and it says that Ruby is
“the same” or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
matching? Will Ruby 2.0 use it?

On Mar 12, 2010, at 12:54 PM, Aldric G. wrote:

|This probably doesn’t belong in Ruby Core. I was reading this article:
|Regular Expression Matching Can Be Simple And Fast
|Regular Expression Matching: the Virtual Machine Approach
|Regular Expression Matching in the Wild
|
|And in the first few lines of the first page, it shows that the Regexp
|implementation for Perl is absolutely terrible, and it says that Ruby is
|“the same” or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
|matching? Will Ruby 2.0 use it?

I remember reading that article when it was first published.

Ruby switched to a new regex library, Oniguruma, almost 2 years ago (it
is the engine used by Ruby 1.9). Check out
http://rubyforge.org/projects/oniguruma for more information.

It would be interesting to see those benchmarks run again with these new
implementations.

cr

Aldric G. wrote:

|This probably doesn’t belong in Ruby Core. I was reading this article:
|Regular Expression Matching Can Be Simple And Fast
|Regular Expression Matching: the Virtual Machine Approach
|Regular Expression Matching in the Wild
|
|And in the first few lines of the first page, it shows that the Regexp
|implementation for Perl is absolutely terrible, and it says that Ruby is
|“the same” or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
|matching? Will Ruby 2.0 use it?

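# Returns two strings: str repeated n times followed by n copies of
# str's first character, and those n copies on their own.
def pathologicalize(str, n)
  tmp = str[0] * n
  [str * n + tmp, tmp]
end

puts RUBY_VERSION
1.step(101, 10) do |n|
  full_str, part_str = pathologicalize("a?", n)
  start = Time.now
  # As written, part_str (the plain run of "a"s) is interpolated into
  # the regex and full_str is the string being searched.
  1000.times do
    full_str =~ /#{part_str}/
  end
  duration = Time.now - start
  puts "#{n}:\t#{duration}"
end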

#output
1.9.1
1: 0.015625
11: 0.015625
21: 0.015625
31: 0.03125
41: 0.03125
51: 0.046875
61: 0.046875
71: 0.046875
81: 0.0625
91: 0.0625
101: 0.078125
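
For comparison, here is a minimal sketch of the case the article describes, where the regex is n copies of "a?" followed by n copies of "a" and the string being matched is n plain "a"s. In the benchmark above the regex is built from part_str, so as far as I can tell the roles are the other way around and those timings do not exercise the backtracking case. The helper name pathological_pair and the small range of n are my own choices for illustration, and the numbers will depend heavily on the Ruby version and regex engine in use.

```ruby
# Build the pair used in the article for a given n:
#   pattern: "a?" repeated n times, then "a" repeated n times
#   subject: "a" repeated n times
# (pathological_pair is a hypothetical helper name, not from the thread.)
def pathological_pair(n)
  pattern = Regexp.new("a?" * n + "a" * n)
  subject = "a" * n
  [pattern, subject]
end

puts RUBY_VERSION
# n is kept small on purpose: a backtracking matcher can need time
# that grows exponentially with n on this input.
1.step(21, 5) do |n|
  pattern, subject = pathological_pair(n)
  start = Time.now
  subject =~ pattern
  duration = Time.now - start
  puts "#{n}:\t#{duration}"
end
```

If the duration grows sharply as n increases, the engine is backtracking on this input; if it stays roughly flat, it is not. Raise the upper bound of the step carefully, since each increment can multiply the run time.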

Siep

Hi,

In message “Re: Ruby Regexp implementation?”
on Sat, 13 Mar 2010 03:54:09 +0900, Aldric G. writes:

|And in the first few lines of the first page, it shows that the Regexp
|implementation for Perl is absolutely terrible, and it says that Ruby is
|“the same” or at least similar. Is Ruby 1.8 / 1.9 using the Thompson NFA
|matching? Will Ruby 2.0 use it?

No plan. We need more (human) resources.

          matz.