Regex: greedy pattern


#1

Hello,

In the code below, the pattern /#{a}/ consumes more than what I’d
expect it to:

irb(main):338:0> abbr = %w[Mr. Dr. i.e. Prof.]
text = "Mr. Drake and Dr. Hide, i.e., Mr. Dride, I presume?"
abbr.each do |a|
  abbrNoDot = a.gsub(/\./,"")
  text.gsub!(/#{a}/,abbrNoDot)
end
puts text
=> ["Mr.", "Dr.", "ave.", "st.", "i.e.", "Prof."]
=> "Mr. Drake and Dr. Hide, i.e., Mr. Dride, I presume?"
=> ["Mr.", "Dr.", "ave.", "st.", "i.e.", "Prof."]
Mr Drke and Dr Hie ie, Mr Drde, I presume?

Thus “Drake” and “Dride” satisfy the pattern “Dr”, becoming “Drke” and
“Drde”. I’ve tried many variations on the pattern /#{a}/, but I just
don’t have enough Regex in my mind yet.

Thanks for the help,
basi


#2

On Dec 8, 2005, at 11:37 PM, basi wrote:

end
Thanks for the help,
basi

/Dr\b/
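
A minimal sketch of what the word boundary buys you, using a shortened version of the sample sentence (the variable names are only for illustration). Note that \b only keeps "Dr" from matching inside "Drake"; a dot in the pattern would still need a backslash to be taken literally.

```ruby
text = "Mr. Drake and Dr. Hide"

# \b demands a word boundary right after the "r", so the "Dr" in "Drake"
# (followed by the word character "a") is skipped.
bare_matches = text.scan(/Dr\b/)

# The dot is written as \. so that it only matches a literal period.
without_dot = text.gsub(/Dr\b\./, "Dr")

p bare_matches
puts without_dot
```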


#3

You need to escape the dot, which otherwise matches any character.

abbr = %w[Mr. Dr. i.e. Prof.]
text = "Mr. Drake and Dr. Hide, i.e., Mr. Dride, I presume?"
abbr.each do |a|
text.gsub!(/#{Regexp.escape(a)}/, a.gsub(/\./, ''))
end
puts text
#=> Mr Drake and Dr Hide, ie, Mr Dride, I presume?

daz


#4

Hi,

At Fri, 9 Dec 2005 13:37:35 +0900,
basi wrote in [ruby-talk:169793]:

Thus “Drake” and “Dride” satisfy the pattern “Dr”, becoming “Drke” and
“Drde”. I’ve tried many variations on the pattern /#{a}/, but I just
don’t have enough Regex in my mind yet.

It’s not caused by greediness. You have to escape the Regexp
metacharacters:

/#{Regexp.quote(a)}/
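
Regexp.quote is an alias for Regexp.escape, so either spelling works. As a rough sketch of one possible variation (not something proposed in this thread), Regexp.union escapes plain strings in the same way, which lets all the abbreviations be handled in a single gsub pass. The variable names below are only for illustration.

```ruby
abbr = %w[Mr. Dr. i.e. Prof.]
text = "Mr. Drake and Dr. Hide, i.e., Mr. Dride, I presume?"

# Escaping puts a backslash in front of each dot.
escaped = Regexp.escape("i.e.")   # same result as Regexp.quote("i.e.")

# Regexp.union escapes plain strings, so every dot is literal.
pattern = Regexp.union(abbr)

# One pass over the text: strip the dots from each matched abbreviation.
result = text.gsub(pattern) { |m| m.delete(".") }

puts escaped
puts result
```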


#5

It works!
So a metacharacter in a string that is interpolated into a pattern is
still a metacharacter. How obvious it is now that it’s been pointed out
to me.

Thanks to all who responded,
basi
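
For anyone who finds this thread later, a small sketch of that point, assuming Ruby 2.4 or newer for match? (the variable names are only for illustration): the interpolated string is compiled as pattern source, so the dot keeps its wildcard meaning unless it is escaped first.

```ruby
a = "Dr."

unescaped = /#{a}/                   # the dot stays a wildcard
escaped   = /#{Regexp.escape(a)}/    # the dot becomes a literal period

puts unescaped.source
puts escaped.source

# The unescaped pattern matches "Dra" in "Drake"; the escaped one does not.
p unescaped.match?("Drake")
p escaped.match?("Drake")
p escaped.match?("Dr. Hide")
```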