Ruby Sub Regular Expression


#1

Hello! I have a string: “Hello - 1 - World”

irb(main):017:0> a = "Hello - 1 - World"
=> "Hello - 1 - World"

I want to chop off the "Hello - " part. Hello could be any word, so I
want to match it generically.
To match the first minus I do this:

irb(main):018:0> b = a.sub(/\s-\s/, "")
=> "Hello1 - World"

Good! Now to get rid of the first word I try this:

irb(main):019:0> c = a.sub(/.*\s-\s/, "")
=> "World"

Bad! It matched to the second minus! Why does sub do this? I thought
it was supposed to match the first occurrence only.


#2

Hi. You missed out your non-greedy operator (if that's the right term) in
your regex.

Try:

a = "Hello - 1 - World"

c = a.sub(/.*?\s-\s/, "")

irb(main):004:0> c = a.sub(/.*?\s-\s/, "")
=> "1 - World"
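
As a side note, here is a small sketch (the variable names are just for illustration) showing that sub really does replace only the first match. With the lazy pattern, sub removes one "word - " prefix, while gsub keeps going and removes every one it finds:

```ruby
a = "Hello - 1 - World"

# sub replaces only the first match of the lazy pattern
lazy_once = a.sub(/.*?\s-\s/, "")

# gsub replaces every match: first "Hello - ", then "1 - "
lazy_all = a.gsub(/.*?\s-\s/, "")

puts lazy_once  # 1 - World
puts lazy_all   # World
```

So the original result was not sub matching twice. It was a single match that happened to be very long.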

Hope that helps
Jim


#3

On Wed, Nov 12, 2008 at 4:52 PM, Dave R. wrote:

=> "Hello1 - World"

Good! Now to get rid of the first word I try this:

irb(main):019:0> c = a.sub(/.*\s-\s/, "")
=> "World"

Bad! It matched to the second minus! Why does sub do this? I thought
it was supposed to match the first occurrence only.

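Because * is greedy by default: it matches as much as it can, so .*
runs all the way up to the last " - " in the string. sub still replaces
only one match, but that one match covers "Hello - 1 - ". Adding ? after
the * makes it match as little as possible. Try this: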

irb(main):002:0> c = a.sub(/.*?\s-\s/, "")
=> "1 - World"
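
To see the difference directly, it can help to look at what each pattern actually matches rather than at what is left over. A minimal sketch using String#[] with a regex (the variable names are only illustrative):

```ruby
a = "Hello - 1 - World"

# Greedy: .* grabs everything, then backtracks to the LAST " - "
greedy_match = a[/.*\s-\s/]

# Lazy: .*? grows one character at a time and stops at the FIRST " - "
lazy_match = a[/.*?\s-\s/]

p greedy_match  # "Hello - 1 - "
p lazy_match    # "Hello - "
```

Whatever the pattern matched is exactly what sub removes.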

Jesus.


#4

On 12.11.2008 17:14, Jesús Gabriel y Galán wrote:

irb(main):018:0> b = a.sub(/\s-\s/, "")
Because * by default is greedy, so it tries to match as much as it can.
Try this:

irb(main):002:0> c = a.sub(/.*?\s-\s/, "")
=> "1 - World"

Other variants would be

irb(main):001:0> a = "Hello - 1 - World"
=> "Hello - 1 - World"
irb(main):002:0> a[/\d+\s+-.*/]
=> "1 - World"
irb(main):003:0> a.sub /^\S+\s+-\s+/, ''
=> "1 - World"
irb(main):004:0> a.sub /^\w+\s+-\s+/, ''
=> "1 - World"
irb(main):005:0>
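
A few more variants in the same spirit, sketched under the assumption that the separator is always exactly " - " and that the first word contains no minus sign. The first two avoid regular expressions altogether, the third uses a negated character class instead of a lazy quantifier:

```ruby
a = "Hello - 1 - World"

# Split into at most two pieces at the first " - " and keep the rest
via_split = a.split(" - ", 2).last

# partition returns [before, separator, after] for the first occurrence
via_partition = a.partition(" - ").last

# [^-]* cannot run past the first minus, so greediness is harmless here
via_negated_class = a.sub(/\A[^-]*-\s/, "")

puts via_split          # 1 - World
puts via_partition      # 1 - World
puts via_negated_class  # 1 - World
```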

Cheers

robert


#5

Thank you Jim and Jesus


#6

On Wed, Nov 12, 2008 at 5:11 PM, Jim McKerchar wrote:

Hi. You missed out your non-greedy operator (if that's the right term)

It's usually called a “lazy” quantifier.


#7

Thanks Henrik