Multi-line regular expression match question

Hi,

I have been trying to write a solid regular expression that matches a
block which may span several lines, without success. After several hours I
am almost there, but not quite where I would like to be, so I am hoping
somebody can point me in the right direction.
Here is an example of the data I am dealing with:

01xxxxxxxxxxxxxx
|-:
<20>ABCD
<30>edfghi212
-|
|-:
<20>EFGH
<30>hjkli3232
-|
89xxxxxxxxxxxxx

I need to match everything enclosed between “|-:” and “-|”.
So far I have “/^{\|-:$.*^-\|$/m”, but it is greedy: it returns the
whole span as a single match instead of each block separately. I just
haven’t figured out how to make it reluctant enough to return the blocks
one by one.

The expected matches should look like this:

1:
|-:
<20>ABCD
<30>edfghi212
-|

2:
|-:
<20>EFGH
<30>hjkli3232
-|

Currently it returns:

1:
|-:
<20>ABCD
<30>edfghi212
-|
|-:
<20>EFGH
<30>hjkli3232
-|

Any suggestion is greatly appreciated.
And finally, any good regular expressions book ?? =)

Cheers,
guillermo

On Fri, Nov 19, 2010 at 1:42 PM, Guillermo R. wrote:

Any suggestion is greatly appreciated.
And finally, any good regular expressions book ?? =)

irb(main):018:0> s =<<EOF
irb(main):019:0" 01xxxxxxxxxxxxxx
irb(main):020:0" |-:
irb(main):021:0" <20>ABCD
irb(main):022:0" <30>edfghi212
irb(main):023:0" -|
irb(main):024:0" |-:
irb(main):025:0" <20>EFGH
irb(main):026:0" <30>hjkli3232
irb(main):027:0" -|
irb(main):028:0" 89xxxxxxxxxxxxx
irb(main):029:0" EOF
=> "01xxxxxxxxxxxxxx\n|-:\n<20>ABCD\n<30>edfghi212\n-|\n|-:\n<20>EFGH\n<30>hjkli3232\n-|\n89xxxxxxxxxxxxx\n"
irb(main):036:0> s.scan(/(\|-:.*?-\|)/m)
=> [["|-:\n<20>ABCD\n<30>edfghi212\n-|"],
["|-:\n<20>EFGH\n<30>hjkli3232\n-|"]]

Jesus.
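
For anyone reading along, here is a small sketch of the greedy versus reluctant behaviour on the same sample. The variable names are only illustrative, and the squiggly heredoc (`<<~`) assumes Ruby 2.3 or later. Note that the pipes have to be escaped, since a bare `|` means alternation.

```ruby
# Sample input from the thread: two blocks back to back.
text = <<~EOF
  01xxxxxxxxxxxxxx
  |-:
  <20>ABCD
  <30>edfghi212
  -|
  |-:
  <20>EFGH
  <30>hjkli3232
  -|
  89xxxxxxxxxxxxx
EOF

# /m lets "." match newlines in Ruby.
# ".*?" stops at the first "-|" it can reach; ".*" runs on to the last one.
lazy   = text.scan(/\|-:.*?-\|/m)
greedy = text.scan(/\|-:.*-\|/m)

puts lazy.length    # one entry per block
puts greedy.length  # everything swallowed into a single match
```

Without a capture group, `scan` returns a flat array of strings; with the parentheses used in the irb session it returns one single-element array per match.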

On Fri, Nov 19, 2010 at 2:42 PM, Guillermo R. wrote:

I need to match anything that is enclosed in between “|-:” and “-|”
So far i’ve got “/^{\|-:$.*^-\|$/m” , this one is greedy, returning the
complete set, instead of each match, i just haven’t figure out how to
make it reluctant enough to return one by one.

If you’re using Ruby 1.9 you can use this; *? is the reluctant version
of *:

/^\|-:$.*?^-\|$/m

Note that I removed the ‘{’ from your original pattern. It is not
needed in this case.
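
A quick sketch of what the ^ and $ anchors add, using a made-up input (not from the original data) where a data line happens to contain "-|". In Ruby, ^ and $ always match at line boundaries, and /m only changes what "." matches.

```ruby
# Hypothetical block whose first data line contains "-|" in the middle.
sample = "|-:\n<20>AB-|CD\n<30>xyz\n-|\n"

# Without anchors the reluctant match stops at the first "-|" anywhere.
unanchored = sample[/\|-:.*?-\|/m]

# With anchors the delimiters must each sit alone on their own line.
anchored = sample[/^\|-:$.*?^-\|$/m]

p unanchored
p anchored
```

So the anchored form is the safer choice whenever the payload lines might contain the delimiter characters themselves.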

If you’re on 1.8, one possibility is:

/^\|-:$(?m:.*?)(?=^-\|)^-\|$/

But there are other, probably more efficient, ways to do it as well.
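
As one possible alternative, here is a sketch that avoids a multi-line regex altogether and walks the input line by line. The method name `extract_blocks` is made up for illustration, and whether this is actually faster has not been measured.

```ruby
# Collect every block that starts with a "|-:" line and ends with a "-|" line.
def extract_blocks(text)
  blocks  = []
  current = nil                     # nil means "not inside a block"

  text.each_line do |line|
    stripped = line.chomp
    if stripped == "|-:"
      current = [stripped]          # open a new block
    elsif current
      current << stripped
      if stripped == "-|"           # close the block and store it
        blocks << current.join("\n")
        current = nil
      end
    end
  end

  blocks
end

data = "01xxxxxxxxxxxxxx\n|-:\n<20>ABCD\n<30>edfghi212\n-|\n" \
       "|-:\n<20>EFGH\n<30>hjkli3232\n-|\n89xxxxxxxxxxxxx\n"

extract_blocks(data).each_with_index do |block, i|
  puts "#{i + 1}:"
  puts block
end
```

A block that is opened but never closed is silently dropped in this version, which may or may not be what you want.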

And finally, any good regular expressions book ?? =)

Jeffrey Friedl’s Mastering Regular Expressions is an excellent read
and covers regular expressions inside and out, literally. Here’s the
book on Amazon:

Mastering Regular Expressions, 3rd Edition

HTH,
Ammar

On Fri, Nov 19, 2010 at 3:50 PM, Ammar A. wrote:

If you’re using Ruby 1.9 you can use this; *? is the reluctant version of *:

/^\|-:$.*?^-\|$/m

--8<--

If you’re on 1.8, one possibility is:

/^\|-:$(?m:.*?)(?=^-\|)^-\|$/

I was under the impression that the reluctant versions of the four
quantifiers were only available under Ruby 1.9, but they are apparently
available under 1.8 as well. I used one in the example I showed for 1.8
without noticing.

Regards,
Ammar
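
To illustrate, here is a short sketch comparing greedy and reluctant forms, assuming "the four quantifiers" means *, +, ? and {n,m}. It was written with a current Ruby in mind, not checked against 1.8.

```ruby
s = "aaaa"

# Each greedy quantifier next to its reluctant twin, matched against "aaaa".
results = {
  "a*"       => s[/a*/],        # as many as possible
  "a*?"      => s[/a*?/],       # as few as possible (zero)
  "a+"       => s[/a+/],
  "a+?"      => s[/a+?/],       # at least one, so exactly one
  "a?"       => s[/a?/],
  "a??"      => s[/a??/],       # prefers matching nothing
  "a{2,3}"   => s[/a{2,3}/],
  "a{2,3}?"  => s[/a{2,3}?/]    # stops at the lower bound
}

results.each { |pattern, match| puts "#{pattern.ljust(8)} => #{match.inspect}" }
```

On its own a reluctant quantifier takes the minimum; it only grows when the rest of the pattern forces it to, which is why `.*?` followed by the closing delimiter stops at the first block.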

Jesús Gabriel y Galán wrote in post #962571:

Thank you very much =)
It works like a charm.
guillermo

Ammar A. wrote in post #962576:

Thanks for the clarification, Ammar; it works perfectly. Thanks also for
the book recommendation, which is very useful. Thanks a lot.

guillermo