Regex: get the first match

Trochalakis_C · June 10, 2007, 12:52pm

Hello!

I want to parse a tagged string like this: “this ismy
string”

I am doing:

“this ismy string”.scan(/(.*)/)
=> [[“this ismy string”]]

What I want is a regex that will return the first segment that
matches.
In the above case -> [[“this is”, “my string”]]

Is there any way to do this?

Thanks!

Trochalakis_C · June 10, 2007, 2:09pm

On 6/10/07, Trochalakis C. wrote:

What I want is a regex that will return the first segment that
matches.
In the above case → [[“this is”, “my string”]]

Is there any way to do this?

Thanks!

This is a FAQ, and yes, I will give the solution.
Regexps are greedy by default: they consume as many characters as
possible. Here are some possibilities (not tested):

(1) use non-greedy matches
“this ismy string”.scan(/(.*?)/)
(2) use less general expressions
“this ismy string”.scan(/(.[^<]*)/)
(3) combine both
“this ismy string”.scan(/(.[^<]*?)/)
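For comparison, here is a small sketch of how the greedy pattern and the three variants above behave. The tags in the example string did not survive in this thread, so the tag name `tag` and the delimiters around the capture group are assumptions made purely for illustration.

```ruby
# Hypothetical input: the tag name "tag" is an assumption, not taken from the thread.
str = "<tag>this is</tag><tag>my string</tag>"

# Greedy: .* runs on to the LAST closing tag, so everything ends up in one capture.
greedy   = str.scan(/<tag>(.*)<\/tag>/)

# (1) Non-greedy: .*? stops at the first closing tag it can reach.
lazy     = str.scan(/<tag>(.*?)<\/tag>/)

# (2) Less general: [^<]* cannot run past a "<", so it cannot swallow a closing tag.
negated  = str.scan(/<tag>(.[^<]*)<\/tag>/)

# (3) Both combined.
combined = str.scan(/<tag>(.[^<]*?)<\/tag>/)

p greedy    # [["this is</tag><tag>my string"]]
p lazy      # [["this is"], ["my string"]]
p negated   # [["this is"], ["my string"]]
p combined  # [["this is"], ["my string"]]
```

On this input all three variants give the same result; they only differ on edge cases such as empty content or a stray "<" inside the text.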

HTH
Robert

P.S.
This really is a FAQ though

Trochalakis_C · June 10, 2007, 2:22pm

On 6/10/07, Robert D. wrote:

=> [[“this ismy string”]]

This is a FAQ, and yes, I will give the solution.
Regexps are greedy by default: they consume as many characters as
possible. Here are some possibilities (not tested):

(1) use non-greedy matches
“this ismy string”.scan(/(.*?)/)
(2) use less general expressions
“this ismy string”.scan(/(.[^<]*)/)
(3) combine both
“this ismy string”.scan(/(.[^<]*?)/)

Unless you want to match strings like <foo, it would be simpler to
just use [^<]*, and not .[^<]*. .[^<]* will also not match an element
with nothing between its tags. If the intent was to make the regexp
not match that, a better regexp would be [^<]+
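A rough sketch of the difference between these three character-class variants. The tag name `tag` and both sample strings are assumptions for illustration only, since the original tags were lost from this thread.

```ruby
# Hypothetical inputs; the tag name "tag" is an assumption.
with_empty = "<tag></tag><tag>my string</tag>"  # first element has no content
with_angle = "<tag><foo</tag>"                  # content starts with a stray "<"

star     = /<tag>([^<]*)<\/tag>/   # zero or more non-"<" characters
dot_star = /<tag>(.[^<]*)<\/tag>/  # any one character, then non-"<" characters
plus     = /<tag>([^<]+)<\/tag>/   # one or more non-"<" characters

# Empty content: only [^<]* matches it (as an empty string).
p with_empty.scan(star)      # [[""], ["my string"]]
p with_empty.scan(dot_star)  # [["my string"]]
p with_empty.scan(plus)      # [["my string"]]

# Stray "<": the leading "." in .[^<]* is allowed to match it.
p with_angle.scan(star)      # []
p with_angle.scan(dot_star)  # [["<foo"]]
p with_angle.scan(plus)      # []
```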

HTH

Trochalakis_C · June 10, 2007, 2:26pm

in the above case -> [[“this is”, “my string”]]
The solution is:

“this ismy string”.scan(/(.*?)/)
=> [[“this is”], [“my string”]]

By default a regexp matches as much as it possibly can.
Adding the ‘?’ character minimizes the match:
use (.*?) instead of (.*), and the separate parts of the string are
no longer merged into one result.
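If only the first segment is wanted rather than every one, the same non-greedy pattern can be used with `String#[]` or `String#match` instead of `scan`. This is only a sketch: the tag name `tag` is an assumption, since the original tags are not visible in this thread.

```ruby
# Hypothetical input; the tag name "tag" is an assumption.
str = "<tag>this is</tag><tag>my string</tag>"

# String#[] with a regexp and a group index returns that capture of the
# first match, or nil when nothing matches.
first = str[/<tag>(.*?)<\/tag>/, 1]

# String#match returns a MatchData object for the first match (or nil).
m = str.match(/<tag>(.*?)<\/tag>/)

p first  # "this is"
p m[1]   # "this is"
```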

Regards,
Grzegorz Golebiowski

Trochalakis_C · June 10, 2007, 7:20pm

On Jun 10, 3:22 pm, GrzechG wrote:

in the above case → [[“this is”, “my string”]]

Regards,
Grzegorz Golebiowski

Thanks Grzegorz, nice trick!

Trochalakis_C · June 10, 2007, 7:31pm

On 6/10/07, Trochalakis C. wrote:

Thanks Grzegorz, nice trick!

You are welcome
Robert

Trochalakis_C · June 10, 2007, 2:46pm

On 6/10/07, Logan C. wrote:

“this ismy string”.scan(/(.*)/)

“this ismy string”.scan(/(.[^<]*?)/)

Unless you want to match strings like <foo, it would be simpler to
just use [^<]*, and not .[^<]*. .[^<]* will also not match an element
with nothing between its tags. If the intent was to make the regexp
not match that, a better regexp would be [^<]+

Thanks for correcting my typos.
Robert