Forum: Ruby on Rails Regular Expression Grouping

Marcelino Debajo (mdebajo)
on 2008-10-19 16:30
(Received via mailing list)
Hi!

I couldn't understand the behavior of this code:

match =  'Today is Feb 23rd, 2003'.match(/Feb 23(rd)?/)
a = match.to_a
puts a.size                #  2
puts a.join(",")           #  Feb 23rd,rd
puts a[0]                   #  Feb 23rd
puts a[1]                   #  rd

In my understanding, /Feb 23(rd)?/ is equivalent to /Feb 23|Feb
23rd/, so the match should not include 'rd'.

Thanks.
Frederick Cheung (Guest)
on 2008-10-19 16:35
(Received via mailing list)
On 19 Oct 2008, at 15:01, mars wrote:

> puts a[1]                   #  rd
>
> In my understanding, /Feb 23(rd)?/ is equivalent to /Feb 23|Feb
> 23rd/ . So, match should not include 'rd'.

?, + and * are greedy, i.e. they always try to match as much of the
string as possible, so rd is part of the match.
If you want a non-greedy quantifier you need to add ? to it, for example:
match =  'Today is Feb 23rd, 2003'.match(/Feb 23(rd)??/)
match[0] #=> "Feb 23"

Fred
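
A minimal sketch of the greedy versus non-greedy behaviour described above, using the same string as the thread. The variable names (text, greedy, lazy, forced) are only illustrative. The last pattern adds a trailing comma, which is an assumption made here to show that a non-greedy group still matches when the rest of the pattern requires it.

```ruby
text = 'Today is Feb 23rd, 2003'

# Greedy ?: the optional group is matched whenever it can be.
greedy = text.match(/Feb 23(rd)?/)
p greedy[0]  # "Feb 23rd"
p greedy[1]  # "rd"

# Non-greedy ??: the optional group is skipped whenever it can be.
lazy = text.match(/Feb 23(rd)??/)
p lazy[0]    # "Feb 23"
p lazy[1]    # nil

# Non-greedy, but the trailing comma can only match after "rd",
# so the engine backtracks and takes the group anyway.
forced = text.match(/Feb 23(rd)??,/)
p forced[0]  # "Feb 23rd,"
p forced[1]  # "rd"
```

In both cases match[0] is the whole match and match[1] is whatever the first group captured; greediness only changes whether the group takes part.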
Marcelino Debajo (mdebajo)
on 2008-10-19 17:05
(Received via mailing list)
Yeah you're right,

> match =  'Today is Feb 23rd, 2003'.match(/Feb 23(rd)??/)
> match[0] #=> "Feb 23"
> match[1] #=> nil

I expected:

> match =  'Today is Feb 23rd, 2003'.match(/Feb 23(rd)?/)  #  ? is greedy here
> match[0] #=> "Feb 23rd"
> match[1] #=> nil

But what I got in ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
is:
> match[0] #=> "Feb 23rd"
> match[1] #=> "rd"

The behavior of '?' being greedy is correct since it matched "Feb
23rd" which is stored in match[0] . But should match[1] not be nil?
The regular expression does not match "rd" alone.

thanks



Frederick Cheung (Guest)
on 2008-10-19 17:15
(Received via mailing list)
On 19 Oct 2008, at 16:04, Marcelino Debajo wrote:

>> greedy here
> The regular expression does not match "rd" alone.
match[0] is the whole match; match[1] is the first capture group,
(rd), which in this example should be 'rd' because the group took
part in the match (unless you're talking about something else).

Fred
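
A short sketch of why match[1] is "rd" here: parentheses capture, so the group shows up in the MatchData whether or not it was optional. The non-capturing form (?:rd)? and the group-free alternation are not from the thread; they are shown only for comparison, and the variable names are illustrative. Note that the alternation is written with the longer branch first, since Ruby tries alternatives left to right and /Feb 23|Feb 23rd/ would stop at "Feb 23".

```ruby
text = 'Today is Feb 23rd, 2003'

# Capturing group: element 0 is the whole match, element 1 is the group.
capturing = text.match(/Feb 23(rd)?/)
p capturing.to_a      # ["Feb 23rd", "rd"]

# Non-capturing group: same match, but nothing is stored for the group.
non_capturing = text.match(/Feb 23(?:rd)?/)
p non_capturing.to_a  # ["Feb 23rd"]

# Alternation without groups: only the whole match is stored.
alternation = text.match(/Feb 23rd|Feb 23/)
p alternation.to_a    # ["Feb 23rd"]

# match[1] is nil only when the group did not take part in the match.
skipped = 'Today is Feb 23, 2003'.match(/Feb 23(rd)?/)
p skipped.to_a        # ["Feb 23", nil]
```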
Marcelino Debajo (mdebajo)
on 2008-10-19 17:21
(Received via mailing list)
Thanks, Fred. I think I misunderstood something.
