=~ vs match - Some benchmarks


#1

While profiling some code I noticed =~, String.match and Regexp.match
turning up in my hotspots. So here are the results of a little
benchmarking to resolve which is faster to use…

require 'benchmark'

n = 3000000
toad = "frogs"
regex = /frogs/     # precompiled literal
regexo = /frogs/o   # precompiled literal with the "once" flag

line = "as fun as looking for frogs in custard"
puts "string='#{line}'"
puts "regex=#{regex}"

# Labels are single-quoted so that #{toad} is printed literally.
Benchmark.bm(20) do |x|
  x.report('string=~/frogs/o')   { for i in 1...n; line =~ /frogs/o;   end }
  x.report('string=~/frogs/')    { for i in 1...n; line =~ /frogs/;    end }
  x.report('string=~/#{toad}/o') { for i in 1...n; line =~ /#{toad}/o; end }
  x.report('string=~regex')      { for i in 1...n; line =~ regex;      end }
  x.report('regex=~string')      { for i in 1...n; regex =~ line;      end }
  x.report('string=~regexo')     { for i in 1...n; line =~ regexo;     end }
  x.report('string=~/#{toad}/')  { for i in 1...n; line =~ /#{toad}/;  end }
  x.report('regex.match')        { for i in 1...n; regex.match(line);  end }
  x.report('string.match')       { for i in 1...n; line.match(regex);  end }
end

Results on a P4 2.66GHz using ruby-1.8.3…

string='as fun as looking for frogs in custard'
regex=(?-mix:frogs)
                          user     system      total        real
string=~/frogs/o      4.390000   1.170000   5.560000 (  5.936591)
string=~/frogs/       4.390000   1.160000   5.550000 (  6.922374)
string=~/#{toad}/o    4.450000   1.130000   5.580000 (  7.111464)
string=~regex         4.850000   1.100000   5.950000 (  8.030837)
regex=~string         4.690000   1.150000   5.840000 (  7.055707)
string=~regexo        5.000000   1.120000   6.120000 (  6.471198)
string=~/#{toad}/    23.750000   1.290000  25.040000 ( 28.654952)
regex.match           7.180000   1.180000   8.360000 (  9.444980)
string.match          7.950000   1.170000   9.120000 (  9.726876)

Results on same machine, using an “all optimization on” version of
ruby-1.9 (CVS version)…

string='as fun as looking for frogs in custard'
regex=(?-mix:frogs)
                          user     system      total        real
string=~/frogs/o      1.950000   0.000000   1.950000 (  2.005067)
string=~/frogs/       1.960000   0.010000   1.970000 (  2.000758)
string=~/#{toad}/o    2.070000   0.000000   2.070000 (  2.150130)
string=~regex         2.120000   0.000000   2.120000 (  2.199664)
regex=~string         2.150000   0.000000   2.150000 (  2.284082)
string=~regexo        2.230000   0.010000   2.240000 (  2.337823)
string=~/#{toad}/    12.580000   0.040000  12.620000 ( 14.492860)
regex.match           3.870000   0.010000   3.880000 (  5.636765)
string.match          4.800000   0.010000   4.810000 (  5.903205)
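
To make the two tables above easier to compare, here is a small sketch that expresses a few of the posted "total" figures as multiples of the plain string=~/frogs/ case. The numbers are copied from the tables; the helper names posted_totals and slowdowns are made up for this illustration and are not part of the original benchmark.

```ruby
# Total CPU seconds (the "total" column) copied from the two result tables above.
def posted_totals
  {
    'ruby-1.8.3' => {
      'string=~/frogs/'   => 5.55,
      'string=~/#{toad}/' => 25.04,
      'regex.match'       => 8.36,
      'string.match'      => 9.12
    },
    'ruby-1.9 (CVS)' => {
      'string=~/frogs/'   => 1.97,
      'string=~/#{toad}/' => 12.62,
      'regex.match'       => 3.88,
      'string.match'      => 4.81
    }
  }
end

# Express every total as a multiple of the baseline label's total.
def slowdowns(totals, baseline = 'string=~/frogs/')
  base = totals.fetch(baseline)
  totals.map { |label, secs| [label, (secs / base).round(2)] }.to_h
end

posted_totals.each do |ruby, totals|
  puts ruby
  slowdowns(totals).each { |label, ratio| puts format('  %-20s %5.2fx', label, ratio) }
end
```

On both interpreters the interpolated regex without /o costs several times the baseline, while the two match calls stay within a small multiple of it.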

John C. Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : removed_email_address@domain.invalid
New Zealand

Carter’s Clarification of Murphy’s Law.

“Things only ever go right so that they may go more spectacularly wrong
later.”

From this principle, all of life and physics may be deduced.


#2

On 12/20/05, John C. removed_email_address@domain.invalid wrote:

While profiling some code I noticed =~, String.match and Regexp.match
turning up in my hotspots. So here are the results of a little
benchmarking to resolve which is faster to use…

Huh. Interesting. Why is
string=~/frogs/o
faster than:
string=~/frogs/
…when there are no substitutions involved?


#3

On 12/20/05, Wilson B. removed_email_address@domain.invalid wrote:

On 12/20/05, John C. removed_email_address@domain.invalid wrote:

While profiling some code I noticed =~, String.match and Regexp.match
turning up in my hotspots. So here are the results of a little
benchmarking to resolve which is faster to use…

Thanks for running these, and posting them. Wonder if there shouldn’t
be a wiki section for benchmarks.

Huh. Interesting. Why is
string=~/frogs/o
faster than:
string=~/frogs/
…when there are no substitutions involved?

Purely a guess… it’s not checking to see if any substitution is
needed on further runs, where by default, it checks every time.
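
Whatever the cause of the timing gap, what /o does to a regex that contains #{...} can be checked directly. The sketch below assumes MRI's usual behaviour, where a /o literal is interpolated the first time it is evaluated and then reused; the method names are hypothetical.

```ruby
# With /o the interpolation happens on the first evaluation only.
def match_with_o(word, line)
  line =~ /#{word}/o
end

# Without /o the regex is rebuilt from the current value of word on every call.
def match_without_o(word, line)
  line =~ /#{word}/
end

line = "as fun as looking for frogs in custard"

first  = match_with_o("frogs", line)    # builds /frogs/ and keeps it
second = match_with_o("custard", line)  # still matches against /frogs/

puts "with /o:    #{first.inspect} then #{second.inspect}"
puts "without /o: #{match_without_o('frogs', line).inspect} then " \
     "#{match_without_o('custard', line).inspect}"
```

This is why /#{toad}/o lands next to the plain literals in the tables, and also why /o is only safe when the interpolated value never changes.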


#4

On Wed, 21 Dec 2005, Wilson B. wrote:

On 12/20/05, John C. removed_email_address@domain.invalid wrote:

While profiling some code I noticed =~, String.match and Regexp.match
turning up in my hotspots. So here are the results of a little
benchmarking to resolve which is faster to use…

Huh. Interesting. Why is
string=~/frogs/o
faster than:
string=~/frogs/
…when there are no substitutions involved?

I suspect they are the same to within the experimental error, i.e. no
difference.
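
One rough way to check that is to time each variant several times and look at the spread. This is only a sketch: the helper repeated_timings and the run and iteration counts are arbitrary choices for illustration, not what was used for the tables.

```ruby
require 'benchmark'

# Time the same block several times; returns wall-clock seconds for each run.
def repeated_timings(runs, iterations)
  (1..runs).map do
    Benchmark.realtime { iterations.times { yield } }
  end
end

line = "as fun as looking for frogs in custard"

with_o    = repeated_timings(5, 200_000) { line =~ /frogs/o }
without_o = repeated_timings(5, 200_000) { line =~ /frogs/ }

puts format('with /o:    min %.4f  max %.4f', with_o.min, with_o.max)
puts format('without /o: min %.4f  max %.4f', without_o.min, without_o.max)
```

If the two min..max ranges overlap, a single run cannot tell the variants apart.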



#5

John C. wrote:

While profiling some code I noticed =~, String.match and Regexp.match
turning up in my hotspots. So here are the results of a little
benchmarking to resolve which is faster to use… {cut}

string=~regexo        5.000000   1.120000   6.120000 (  6.471198)
string=~/#{toad}/    23.750000   1.290000  25.040000 ( 28.654952)
regex.match           7.180000   1.180000   8.360000 (  9.444980)
string.match          7.950000   1.170000   9.120000 (  9.726876)
{cut}

Interesting test, John. :)
My results on a P4 1.8GHz using ruby-1.8.2 on Linux:

string='as fun as looking for frogs in custard'
regex=(?-mix:frogs)
                          user     system      total        real
string=~/frogs/o      3.650000   0.000000   3.650000 (  3.683330)
string=~/frogs/       3.520000   0.010000   3.530000 (  3.560874)
string=~/#{toad}/o    3.520000   0.010000   3.530000 (  3.612038)
string=~regex         4.110000   0.000000   4.110000 (  4.159460)
regex=~string         3.970000   0.010000   3.980000 (  4.009281)
string=~regexo        4.010000   0.010000   4.020000 (  4.046814)
string=~/#{toad}/    29.650000   0.150000  29.800000 ( 30.071113)
regex.match           6.920000   0.030000   6.950000 (  7.015763)
string.match          8.230000   0.210000   8.440000 (  8.739469)

Antonio


#6

Hi,

At Wed, 21 Dec 2005 09:42:06 +0900,
Bill G. wrote in [ruby-talk:171822]:

Huh. Interesting. Why is
string=~/frogs/o
faster than:
string=~/frogs/
…when there are no substitutions involved?

Purely a guess… it’s not checking to see if any substitution is
needed on further runs, where by default, it checks every time.

No, the check is done at the compilation phase, so the resulting
ASTs should be equivalent. I guess it's just measurement error.
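
A quick way to see the effect of that compile-time decision from Ruby itself is to compare object identity. This is a sketch assuming MRI behaviour, with hypothetical method names: a literal without interpolation hands back the same Regexp object on every evaluation, while an interpolated literal builds a new one each time.

```ruby
# No interpolation: the Regexp is built when the code is compiled.
def static_literal
  /frogs/
end

# Interpolation without /o: a new Regexp is built on every call.
def interpolated_literal(word)
  /#{word}/
end

a = interpolated_literal("frogs")
b = interpolated_literal("frogs")

puts "static literal reused:      #{static_literal.equal?(static_literal)}"
puts "interpolated literal reused: #{a.equal?(b)}"
puts "interpolated sources equal:  #{a == b}"
```

The extra object construction and pattern compilation on every pass is consistent with the large gap for string=~/#{toad}/ in the results.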