2008/9/19 Tod B.:
I wrote a quickie benchmark. CPU speed and compile options will
certainly influence your results.
http://snippets.dzone.com/posts/show/6098
Hm, it seems lines 13 and 18 are identical. Where’s the lazy quantifier?
Here’s what I’d consider a better benchmark, as it covers the
scenarios I was talking about, especially those where there
is a second potential end point (“b” in this case):
robert@fussel /cygdrive/c/Temp
$ cat l.rb
#!/bin/env ruby
require 'benchmark'

REP = 1_000   # not used below
LONG = 1_000

# [label, subject string] pairs
STRINGS = [
  ["short match", "ab"],
  ["short mismatch", "a"],
  ["long match", "a" * LONG + "b"],
  ["long mismatch", "a" * LONG],
  ["short match double", "abab"],
  ["long match double", "a" * LONG + "bb"],
  ["long match double long", "a" * LONG + "b" + "a" * LONG + "b"],
]

# Label column width: longest label plus room for the "neg " / "lazy " prefix.
Benchmark.bmbm(6 + STRINGS.inject(0) {|m,(a,b)| a.length > m ? a.length : m }) do |b|
  STRINGS.each do |label, str|
    # The long mismatch is far slower per iteration, so it gets fewer repetitions.
    rep = /long mis/ =~ label ? 100 : 100_000

    # Negated character class
    b.report "neg " + label do
      rep.times { /a[^b]*b/ =~ str }
    end

    # Lazy (reluctant) quantifier
    b.report "lazy " + label do
      rep.times { /a.*?b/ =~ str }
    end
  end
end
robert@fussel /cygdrive/c/Temp
$ ./l.rb
Rehearsal
neg short match               0.282000   0.000000   0.282000 (  0.288000)
lazy short match              0.297000   0.000000   0.297000 (  0.284000)
neg short mismatch            0.328000   0.000000   0.328000 (  0.341000)
lazy short mismatch           0.375000   0.000000   0.375000 (  0.366000)
neg long match                9.531000   0.000000   9.531000 (  9.982000)
lazy long match              12.625000   0.000000  12.625000 ( 12.764000)
neg long mismatch             4.672000   0.000000   4.672000 (  4.742000)
lazy long mismatch            6.297000   0.000000   6.297000 (  6.422000)
neg short match double        0.297000   0.000000   0.297000 (  0.291000)
lazy short match double       0.281000   0.000000   0.281000 (  0.287000)
neg long match double         9.406000   0.000000   9.406000 (  9.443000)
lazy long match double       12.500000   0.000000  12.500000 ( 12.592000)
neg long match double long    9.516000   0.000000   9.516000 (  9.642000)
lazy long match double long  12.547000   0.000000  12.547000 ( 12.745000)
----------------------------------------------------- total: 78.954000sec

                                  user     system      total        real
neg short match               0.312000   0.000000   0.312000 (  0.305000)
lazy short match              0.297000   0.000000   0.297000 (  0.301000)
neg short mismatch            0.375000   0.000000   0.375000 (  0.388000)
lazy short mismatch           0.359000   0.000000   0.359000 (  0.356000)
neg long match                9.344000   0.000000   9.344000 (  9.637000)
lazy long match              12.547000   0.000000  12.547000 ( 12.777000)
neg long mismatch             4.703000   0.000000   4.703000 (  4.783000)
lazy long mismatch            6.219000   0.000000   6.219000 (  6.242000)
neg short match double        0.297000   0.000000   0.297000 (  0.301000)
lazy short match double       0.297000   0.000000   0.297000 (  0.297000)
neg long match double         9.453000   0.000000   9.453000 (  9.531000)
lazy long match double       12.718000   0.000000  12.718000 ( 13.566000)
neg long match double long    9.407000   0.000000   9.407000 (  9.442000)
lazy long match double long  12.500000   0.000000  12.500000 ( 12.777000)
robert@fussel /cygdrive/c/Temp
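Notice how the lazy quantifier needs roughly 30 to 35% more CPU time than
the negated character class on the long strings, while for the short
strings the two are practically indistinguishable.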
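
To put a number on the gap, here is a rough sketch that works out the relative slowdown from the "total" column of the second (non-rehearsal) run. The timings are copied by hand from the output above; the helper name slowdown_pct and the TIMES / SLOWDOWNS constants are made up for this illustration and are not part of the Benchmark library.

```ruby
# CPU "total" times in seconds, copied from the second run of l.rb above.
TIMES = {
  "long match"             => { neg: 9.344, lazy: 12.547 },
  "long mismatch"          => { neg: 4.703, lazy: 6.219 },
  "long match double"      => { neg: 9.453, lazy: 12.718 },
  "long match double long" => { neg: 9.407, lazy: 12.500 },
}

# Hypothetical helper: by what percentage does the lazy time exceed the neg time?
def slowdown_pct(neg, lazy)
  ((lazy - neg) / neg * 100).round(1)
end

# Map each label to its slowdown percentage.
SLOWDOWNS = TIMES.map { |label, t| [label, slowdown_pct(t[:neg], t[:lazy])] }.to_h

SLOWDOWNS.each do |label, pct|
  puts format("%-24s lazy needs %.1f%% more time", label, pct)
end
```

These figures only describe this one machine and Ruby build, so treat them as an indication rather than a general rule.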
Also, best intro to regular expressions ever:
Regular Expression Tutorial - Learn How to Use Regular Expressions
Good ref!
Kind regards
robert