Hi,
I’ve some more regex questions. I wrote a pattern to check for valid
regexes and inspect the parts (we all have our reasons for the things we
do:) It wasn’t working so I went down to simpler and simpler patterns,
but I’m a bit surprised at the way Ruby 1.9 is handling the regexes. I
tested the same pattern in Perl and it came out with the answers I’d
expect.
Is this down to me using perl regexes for so long, or is there something
I’m missing about Ruby’s implementation? It appears ^ at the beginning
of a string doesn’t bind as strongly as I’d expect.
I believe this test should fail, as <delim> should be bound to the
beginning of the string by the ^, and the match result is a little bit
crazy - shouldn’t the main capture be “d\d” if it’s following the
logical route it’s chosen?
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\d! )
puts md.inspect
'
#<MatchData "/\\d" mors:nil delim:"d" pat:"\\">
Here I add on a trailing slash to the string, and (I believe) it should
bring me back what’s between the / / :
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\d/! )
puts md.inspect
'
#<MatchData "/\\d" mors:nil delim:"d" pat:"\\">
Here’s the first string in perl 5.12 :
$ perl -e '
if ( q(/\d\d\d) =~ /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g{delim}/ ) {
  while ( my ($key, $value) = each(%+) ) {
    print "$key => $value\n";
  }
}
'
<nothing here, what I’d expect>
And here it is with the “valid” string:
$ perl -e '
if ( q(/\d\d\d/) =~ /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g{delim}/ ) {
  while ( my ($key, $value) = each(%+) ) {
    print "$key => $value\n";
  }
}
'
pat => \d\d\d
delim => /
These are the answers I’d expect.
Even this seems unexpected to me: if I remove the <mors> group, then
surely ^ should bind to the beginning?
$ ruby -e '
md = /^(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\d/! )
puts md.inspect
'
#<MatchData "/\\d" delim:"d" pat:"\\">
These work as I’d expect by using the end of line $ :
$ ruby -e '
md = /^(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" delim:"/" pat:"\\d\\d\\d">
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" mors:nil delim:"/" pat:"\\d\\d\\d">
And finally, if I remove the caret but leave the $ I get the answer I’d
expect (or that I’m looking for) :
$ ruby -e '
md = /(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" mors:nil delim:"/" pat:"\\d\\d\\d">
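One thing that might be worth checking (a sketch, not a confirmed diagnosis): as far as I can tell, in Ruby 1.9's engine \g<name> is a subexpression call that re-runs the group's pattern, while \k<name> is the named backreference, whereas Perl's \g{name} is a backreference. The comparison below assumes the same delim and pat groups as above; the variable names are only for illustration.

```ruby
# Assumption: same named groups (delim, pat) as in the patterns above.
# call_re, backref_re and the string names are hypothetical, for illustration only.
call_re    = /^(?<delim>.)(?<pat>.+?)\g<delim>/ # \g<delim> re-runs the sub-pattern "."
backref_re = /^(?<delim>.)(?<pat>.+?)\k<delim>/ # \k<delim> must repeat the captured text

unterminated = %q!/\d\d\d!
terminated   = %q!/\d\d\d/!

call_md         = call_re.match(terminated)      # subexpression-call form
backref_md      = backref_re.match(terminated)   # backreference form
backref_missing = backref_re.match(unterminated) # no closing delimiter present

puts call_md.inspect
puts backref_md.inspect
puts backref_missing.inspect
```

If the \k form behaves like the Perl version, that would suggest the difference lies in what \g means in each language rather than in how ^ binds.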
Regards,
Iain