Ruby multiline regex problem

Code:

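<td align="left" ><div style="width: 165px; height: 175px;"><a href="http://www.amazon.com/Rails-Recipes/dp/0977616606/ref=pd_sim_b_njs_img_1">testPit something here Best</td>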

Pattern:

<td.*?>.*?</td\s*>

I’m trying to match this whole block and use it for further parsing.
This started from an example in Brian Marick’s book “Everyday
Scripting…” that had to be modified because Amazon has changed its
presentation to tables instead of lists.

Anyway, the regex works fine on a single line. As soon as I introduce
this:

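"
<td align="left" ><div style="width: 165px; height: 175px;"><a
href="http://www.amazon.com/Rails-Recipes/dp/0977616606/ref=pd_sim_b_njs_img_1">testPit
something here

Best
</td>
"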

it fails.

When I try this same expression with Perl using the //s mode, it works.
I understand Ruby uses //m (multi-line mode) in nearly the same fashion,
so that a newline counts as "any character", so it should work,
right? Can anyone tell me what I am doing wrong here? Why isn’t
“multiline” mode working?

Thanks!
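
For reference, a minimal sketch of the behaviour being asked about. The string `html` is a made-up two-line cell, not the Amazon markup from the post. In Ruby, `.` does not match a newline unless the `/m` flag is set, which roughly corresponds to Perl's `/s`:

```ruby
# Hypothetical two-line table cell, used only for illustration.
html = "<td>first line\nsecond line</td>"

pattern_plain = /<td.*?>.*?<\/td\s*>/   # "." stops at newlines
pattern_multi = /<td.*?>.*?<\/td\s*>/m  # "." also matches "\n"

p html[pattern_plain]  # nil
p html[pattern_multi]  # the whole cell, newline included
```

If the second call also returns nil in some tool, the tool is probably not applying the flag or not passing a real newline.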

On Tue, Apr 8, 2008 at 11:21 AM, Gregg Y. wrote:

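# Closing </td> assumed: the patterns below need it to match.
s = '
<td align="left" ><div style="width: 165px; height: 175px;"><a
href="http://www.amazon.com/Rails-Recipes/dp/0977616606/ref=pd_sim_b_njs_img_1">testPit
something here

Best</td>'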

puts "######\ns:"
puts s

r1 = /<td.*?>.*?<\/td.*?>/m
r2 = /<td.*?>(.*?)<\/td.*?>/m

puts "######\nscan with r1:"
puts s.scan(r1)
puts
puts "######\nmatch with r1:"
puts s.match(r1)[0]
puts

s =~ r1
puts "######\n=~ and $1 with r1:"
puts $1   # nil: r1 has no capture group

puts
puts
puts

puts "######\nscan with r2:"
puts s.scan(r2)   # with a capture group, scan returns the captured text
puts
puts "######\nmatch with r2:"
puts s.match(r2)[0]
puts

s =~ r2
puts "######\n=~ and $1 with r2:"
puts $1   # contents of the cell captured by r2

Hmm, I’m not sure if the regexp /<td[^>]*>.*?<\/td[^>]*>/m would be
more appropriate or not.
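
A rough sketch of how the two spellings compare, using a made-up cell whose opening tag is split across two lines (the strings here are assumptions, not the Amazon markup). Both patterns match the cell under `/m`. The difference is that the negated class `[^>]` matches newlines on its own, so the tag part no longer depends on the flag:

```ruby
# Hypothetical cell with attributes spread over two lines.
html = "<td align=\"left\"\n    class=\"x\">cell\ntext</td >"

lazy_dot    = /<td.*?>.*?<\/td.*?>/m
negated_set = /<td[^>]*>.*?<\/td[^>]*>/m

p html[lazy_dot] == html     # true
p html[negated_set] == html  # true

# Without /m, only the negated class gets past the newline inside the tag.
tag = "<td\nid='a'>"
p tag[/<td[^>]*>/]  # "<td\nid='a'>"
p tag[/<td.*?>/]    # nil
```

The cell body still goes through `.*?`, so `/m` remains necessary whenever the content itself spans lines.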

Todd

2008/4/8, Gregg Y.:

Works for me: no match without /m, match with /m:

irb(main):004:0> s=%q{<td align="left" ><div style="width: 165px; height: 175px;"><a
irb(main):005:0' href="http://www.amazon.com/Rails-Recipes/dp/0977616606/ref=pd_sim_b_njs_img_1">testPit
irb(main):006:0' something here Best</td>}
=> "<td align=\"left\" ><div style=\"width: 165px; height: 175px;\"><a\nhref=\"http://www.amazon.com/Rails-Recipes/dp/0977616606/ref=pd_sim_b_njs_img_1\">testPit\nsomething here Best</td>"
irb(main):007:0> s[%r{<td.*?</td\s*>}]
=> nil
irb(main):008:0> s[%r{<td.*?</td\s*>}m]
=> "<td align=\"left\" ><div style=\"width: 165px; height: 175px;\"><a\nhref=\"http://www.amazon.com/Rails-Recipes/dp/0977616606/ref=pd_sim_b_njs_img_1\">testPit\nsomething here Best</td>"
irb(main):009:0>

Cheers

robert

2008/4/10, Ransom T.:

Thanks folks for all your help…turns out that I was using the regex
test view in Eclipse (RDT) which was obviously not behaving properly in
multi-line mode. I guess I need to go out and get the Aptana/Radrails
plugin that has the latest RDT and ruby-debug built in. I identified the
issue using Mike Lovitt’s Rubular regex tester. Thanks Mike for
restarting that server!

Why look so far? IRB serves the same purpose.

Cheers

robert

Robert K. wrote:

Why look so far? IRB serves the same purpose.

I’m a newb with Ruby and IRB. I did test the regex in IRB, but did not
know that I could set up a literal string containing \n characters
through the interface, like you did above. So, of course, it was
passing every time. That is very cool! I am growing fonder of IRB every day…