Regex: get the first match


#1

Hello!

I want to parse a tagged string like this: "<i>this is</i><i>my string</i>"

I am doing:

"<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
=> [["this is</i><i>my string"]]

What I want is a regex that will return the first segment that
matches.
In the above case -> [["this is", "my string"]]

Is there any way to do this?

Thanks!


#2

On 6/10/07, Trochalakis C. removed_email_address@domain.invalid wrote:

What I want is a regex that will return the first segment that
matches.
In the above case -> [["this is", "my string"]]

Is there any way to do this?

Thanks!

This is a FAQ, and yes I will give the solution :wink:
Regexps are greedy by default: they consume as many chars as
possible. There are some possibilities (not tested):

(1) use non-greedy matches
"<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
(2) use less general expressions
"<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*)<\/i>/)
(3) combine both :wink:
"<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*?)<\/i>/)
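
The three suggestions above were posted untested, so here is a minimal sketch that runs them side by side. It assumes the sample string is the question's text with its <i> tags restored; the variable names are only for illustration.

```ruby
# Sample input: the question's string, assuming <i>...</i> tags around each part.
html = "<i>this is</i><i>my string</i>"

non_greedy = html.scan(/<i>(.*?)<\/i>/)      # (1) lazy quantifier
negated    = html.scan(/<i>(.[^<]*)<\/i>/)   # (2) capture stops at the next "<"
combined   = html.scan(/<i>(.[^<]*?)<\/i>/)  # (3) both at once

p non_greedy  # => [["this is"], ["my string"]]
p negated     # => [["this is"], ["my string"]]
p combined    # => [["this is"], ["my string"]]
```

On this input all three agree; they only differ on edge cases such as empty tags or a stray "<" inside the text.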

HTH
Robert

P.S.
This really is a FAQ though


#3

On 6/10/07, Robert D. removed_email_address@domain.invalid wrote:

=> [["this is</i><i>my string"]]

This is a FAQ, and yes I will give the solution :wink:
Regexps are greedy by default: they consume as many chars as
possible. There are some possibilities (not tested):

(1) use non-greedy matches
"<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
(2) use less general expressions
"<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*)<\/i>/)
(3) combine both :wink:
"<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*?)<\/i>/)

Unless you want to match strings like <i><foo</i>, it would be simpler
to just use [^<]*, and not .[^<]*. Also, .[^<]* will not match <i></i>.
If the intent was to make the regexp not match that, a better regexp
would be [^<]+
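
To make the difference between these character-class variants concrete, here is a small sketch. The two input strings are hypothetical edge cases chosen for illustration (a stray "<" inside the tag, and an empty tag); they are not part of the original question.

```ruby
# Hypothetical edge-case inputs, not the thread's sample string.
stray_bracket = "<i><foo</i>"
empty_tag     = "<i></i>"

dot_then_class = /<i>(.[^<]*)<\/i>/  # leading "." accepts any character, even "<"
class_star     = /<i>([^<]*)<\/i>/   # zero or more non-"<" characters
class_plus     = /<i>([^<]+)<\/i>/   # one or more non-"<" characters

p stray_bracket.scan(dot_then_class)  # => [["<foo"]]
p stray_bracket.scan(class_star)      # => []
p empty_tag.scan(dot_then_class)      # => []
p empty_tag.scan(class_star)          # => [[""]]
p empty_tag.scan(class_plus)          # => []
```

So [^<]* accepts empty tags, [^<]+ rejects them, and only the version with the leading "." lets a "<" into the capture.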

HTH


#4

in the above case -> [["this is", "my string"]]

The solution is:

"<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
=> [["this is"], ["my string"]]

By default a quantifier matches as much as it possibly can.
If you put the '?' character after it, it matches as little as possible.
With (.*?) instead of (.*), the two parts of the string are no longer
merged into one result.
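
As a rough illustration of the explanation above, the sketch below contrasts the greedy and non-greedy patterns and also shows one way to take only the first segment, using String#[] with a capture-group index. The sample string assumes the <i> tags from the question; the variable names are illustrative.

```ruby
html = "<i>this is</i><i>my string</i>"

greedy = html.scan(/<i>(.*)<\/i>/)   # one match spanning both tag pairs
lazy   = html.scan(/<i>(.*?)<\/i>/)  # one match per tag pair

# String#[] with a regexp and a group index returns the first match's capture.
first = html[/<i>(.*?)<\/i>/, 1]

p greedy  # => [["this is</i><i>my string"]]
p lazy    # => [["this is"], ["my string"]]
p first   # => "this is"
```

If a flat array is preferred over nested ones, calling flatten on the scan result would be one option.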

Regards,
Grzegorz Golebiowski


#5

On Jun 10, 3:22 pm, GrzechG removed_email_address@domain.invalid wrote:

in the above case -> [["this is", "my string"]]

Regards,
Grzegorz Golebiowski

Thanks Grzegorz, nice trick!


#6

On 6/10/07, Trochalakis C. removed_email_address@domain.invalid wrote:


Thanks Grzegorz, nice trick!

You are welcome :wink:
Robert


#7

On 6/10/07, Logan C. removed_email_address@domain.invalid wrote:

"<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)

"<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*?)<\/i>/)

Unless you want to match strings like <i><foo</i>, it would be simpler to
just use [^<]*, and not .[^<]*. Also, .[^<]* will not match <i></i>. If the
intent was to make the regexp not match that, a better regexp would be [^<]+

Thanks for correcting my typos.
Robert