Hi,
I have been trying, without success, to write a solid regular expression
that matches a block spanning multiple lines. After several hours I am
almost there, but not quite where I would like to be, so I am hoping
somebody can point me in the right direction.
Here is an example of the data I am dealing with:
01xxxxxxxxxxxxxx
|-:
<20>ABCD
<30>edfghi212
-|
|-:
<20>EFGH
<30>hjkli3232
-|
89xxxxxxxxxxxxx
I need to match everything enclosed between “|-:” and “-|”.
So far I’ve got “/^{\|-:$.*^-\|$/m”, but it is greedy: it returns the
complete set as a single match instead of each block separately. I just
haven’t figured out how to make it reluctant enough to return the blocks
one by one.
The expected matches should look like this:
1:
|-:
<20>ABCD
<30>edfghi212
-|
2:
|-:
<20>EFGH
<30>hjkli3232
-|
It currently returns:
1:
|-:
<20>ABCD
<30>edfghi212
-|
|-:
<20>EFGH
<30>hjkli3232
-|
Any suggestions are greatly appreciated.
And finally, can anyone recommend a good book on regular expressions? =)
Cheers,
guillermo
On Fri, Nov 19, 2010 at 1:42 PM, Guillermo R. wrote:
irb(main):018:0> s =<<EOF
irb(main):019:0" 01xxxxxxxxxxxxxx
irb(main):020:0" |-:
irb(main):021:0" <20>ABCD
irb(main):022:0" <30>edfghi212
irb(main):023:0" -|
irb(main):024:0" |-:
irb(main):025:0" <20>EFGH
irb(main):026:0" <30>hjkli3232
irb(main):027:0" -|
irb(main):028:0" 89xxxxxxxxxxxxx
irb(main):029:0" EOF
=> "01xxxxxxxxxxxxxx\n|-:\n<20>ABCD\n<30>edfghi212\n-|\n|-:\n<20>EFGH\n<30>hjkli3232\n-|\n89xxxxxxxxxxxxx\n"
irb(main):036:0> s.scan(/(\|-:.*?-\|)/m)
=> [["|-:\n<20>ABCD\n<30>edfghi212\n-|"],
["|-:\n<20>EFGH\n<30>hjkli3232\n-|"]]
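A small variation on the scan call above, offered only as a sketch: without the capture group, scan returns a flat array of strings instead of an array of one-element arrays, and moving the group inside the delimiters returns just the enclosed lines. The variable names `blocks` and `inner` are made up for this illustration.

```ruby
# Same sample string as in the irb session above.
s = "01xxxxxxxxxxxxxx\n|-:\n<20>ABCD\n<30>edfghi212\n-|\n" \
    "|-:\n<20>EFGH\n<30>hjkli3232\n-|\n89xxxxxxxxxxxxx\n"

# No capture group: scan yields each whole match as a plain string.
blocks = s.scan(/\|-:.*?-\|/m)

# Capture group around the body only: scan yields one-element arrays,
# so flatten is used to get the enclosed text without the delimiters.
inner = s.scan(/\|-:\n(.*?)-\|/m).flatten

p blocks
p inner
```

The pipes are escaped because an unescaped `|` means alternation in a regular expression.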
Jesus.
On Fri, Nov 19, 2010 at 2:42 PM, Guillermo R. wrote:
I need to match everything enclosed between “|-:” and “-|”.
So far I’ve got “/^{\|-:$.*^-\|$/m”, but it is greedy: it returns the
complete set as a single match instead of each block separately. I just
haven’t figured out how to make it reluctant enough to return the blocks
one by one.
If you’re using Ruby 1.9 you can use this; *? is the reluctant version
of *:
/^\|-:$.*?^-\|$/m
Note that I removed the ‘{’ from your original pattern. It is not
needed in this case.
If you’re on 1.8, one possibility is:
/^\|-:$(?m:.*?)(?=^-\|)^-\|$/
But there are other, probably more efficient, ways to do it as well.
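To make the greedy versus reluctant difference concrete, here is a minimal sketch. The sample text is reconstructed from the data posted earlier in the thread, and the variable names `text`, `greedy` and `lazy` are only for illustration.

```ruby
# Sample input: two delimited blocks between two unrelated lines.
text = <<EOF
01xxxxxxxxxxxxxx
|-:
<20>ABCD
<30>edfghi212
-|
|-:
<20>EFGH
<30>hjkli3232
-|
89xxxxxxxxxxxxx
EOF

# Greedy .* runs to the end of the string and backtracks to the LAST "-|" line,
# so both blocks end up in a single match.
greedy = text.scan(/^\|-:$.*^-\|$/m)

# Reluctant .*? stops at the FIRST "-|" line it can reach,
# so each block is returned separately.
lazy = text.scan(/^\|-:$.*?^-\|$/m)

p greedy.length
p lazy.length
p lazy
```

In Ruby, ^ and $ always match at line boundaries, and the /m flag only makes the dot match newlines, which is why the anchors work here without any extra option.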
And finally, any good regular expressions book ?? =)
Jeffrey Friedl’s Mastering Regular Expressions (3rd Edition) is an
excellent read and covers regular expressions inside and out, literally.
HTH,
Ammar
On Fri, Nov 19, 2010 at 3:50 PM, Ammar A. wrote:
If you’re using Ruby 1.9 you can use this; *? is the reluctant version of *:
/^\|-:$.*?^-\|$/m
--8<--
If you’re on 1.8, one possibility is:
/^\|-:$(?m:.*?)(?=^-\|)^-\|$/
I was under the impression that the reluctant versions of the four
quantifiers were only available under Ruby 1.9, but they are apparently
available under 1.8 as well. I used one in the example I showed for 1.8
without noticing.
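For reference, a quick sketch of the reluctant forms next to their greedy counterparts, assuming "the four quantifiers" means *, +, ? and {n,m}. The string `sample` and the hash `results` are made up for the illustration.

```ruby
sample = "aaaa"

# Each pair shows the greedy quantifier and its reluctant (lazy) form.
# String#[] with a regexp returns the first match as a string.
results = {
  "a*"      => sample[/a*/],       # as many as possible
  "a*?"     => sample[/a*?/],      # as few as possible (zero)
  "a+"      => sample[/a+/],
  "a+?"     => sample[/a+?/],      # at least one, no more than needed
  "a?"      => sample[/a?/],
  "a??"     => sample[/a??/],      # prefers matching nothing
  "a{2,4}"  => sample[/a{2,4}/],
  "a{2,4}?" => sample[/a{2,4}?/]   # stops at the lower bound
}

results.each { |pattern, match| puts "#{pattern.ljust(8)} #{match.inspect}" }
```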
Regards,
Ammar
“Jesús Gabriel y Galán” wrote in post #962571:
Many thanks =)
It works like a charm.
guillermo.
Ammar A. wrote in post #962576:
Thanks for the clarification, Ammar. It works perfectly. Thanks also for
the book recommendation, it is very useful.
guillermo