Problem with a regular expression

unknown · October 13, 2006, 7:57pm

I have the following code snippet:

require 'net/http'
     begin
          hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))

          re = /<TD>(.*)<\/TD>/
          if hdoc =~ re
               print "#{$&}\n"
          else
               print "Nothing\n"
          end
     end

The regular expression is never matched when I use the code as shown
above (the expression for re is just a simple one for my testing).
However, if I replace the variable name hdoc by a string like
“<TD>Test</TD><TD>Test1</TD>”, the regular expression is matched. The type of
hdoc is String. What is wrong with the snippet above? I even tried to
replace hdoc by hdoc.to_s and it still doesn’t work.
Thanks for your help!

Charles

unknown · October 13, 2006, 8:19pm

Hi,

On Sat, Oct 14, 2006 at 02:55:10AM +0900, [email protected] wrote:

[...]

It looks like there are no upper case “TD” tags in the page that you are
fetching. Try this instead:

require 'net/http'

begin
  hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))

  # The trailing "i" makes the match case insensitive.
  re = /<TD>(.*)<\/TD>/i
  if hdoc =~ re
    print "#{$&}\n"
  else
    print "Nothing\n"
  end
end

Your regular expression was case sensitive; I changed it to be case
insensitive by adding the “i” switch.
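
To see the effect of the “i” switch without touching the network, here is a minimal sketch. The sample string and the helper name first_cell are made up for illustration; they are not taken from the Yahoo page.

```ruby
# Hypothetical sample input: lower-case tags, as on the fetched page.
SAMPLE_HTML = '<tr><td>DD</td></tr>'

# Returns the text captured between the first <TD> and </TD>,
# or nil when the pattern does not match. The name is illustrative only.
def first_cell(html, re)
  m = re.match(html)
  m && m[1]
end

p first_cell(SAMPLE_HTML, /<TD>(.*)<\/TD>/)   # case sensitive: no match
p first_cell(SAMPLE_HTML, /<TD>(.*)<\/TD>/i)  # case insensitive: matches
```

The first call prints nil because the sample has no upper-case “TD”; the second prints "DD".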

unknown · October 13, 2006, 8:29pm

On Oct 13, 2006, at 1:55 PM, [email protected] wrote:

[...]

When I substitute ‘\/TD’ for ‘/TD’ and make the regex case
insensitive, I get a match. See below:


#! /usr/bin/env ruby -w
require 'net/http'
hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
re = /<TD>(.*)<\/TD>/i ### note changes
if hdoc =~ re
    puts "#{$&}\n"
else
    puts "Nothing\n"
end



Symbol Lookup
Name: Type: Market:
Stocks ETFs Indices Mutual Funds Futures
U.S. & Canada World Market
View supported exchanges
2 results for 'Dupont' (type=Stocks, market=U.S. & Canada)

Regards, Morton
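
As an aside on the escaping, Ruby's %r{...} literal lets the slash appear unescaped, which some find easier to read than ‘\/’. A small sketch under that assumption; the constant names and the sample string are invented for illustration:

```ruby
# The same pattern written two ways; %r{} needs no backslash before "/".
SLASH_RE   = /<TD>(.*)<\/TD>/i
PERCENT_RE = %r{<TD>(.*)</TD>}i

# Hypothetical one-line sample, not taken from the real page.
CELL = '<td>Dupont</td>'

# String#[] with a regexp and a group number returns that capture.
puts CELL[SLASH_RE, 1]
puts CELL[PERCENT_RE, 1]
```

Both lines print Dupont, so the choice between the two spellings is purely one of readability.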

unknown · October 13, 2006, 9:06pm

Morton G. wrote:

[...]

Morton, Aaron,

You are both right, thanks a lot! I also added “m” at the end of the
regular expression to match whatever might span two lines.
Cheers!

Charles
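
For reference, a small sketch of what the “m” switch changes: without it, “.” does not match a newline. The sample strings are invented. With “m”, the greedy “.*” can run across several cells, so the sketch also shows the non-greedy “.*?” as one possible alternative; that variant is an addition here, not something mentioned in the thread.

```ruby
# Hypothetical samples, invented for this sketch.
SPLIT_CELL = "<td>Du\nPont</td>"               # one cell spanning two lines
TWO_CELLS  = "<td>Du\nPont</td><td>NYSE</td>"  # the same cell followed by another

# String#[] with a regexp and a group number returns that capture, or nil.
p SPLIT_CELL[/<TD>(.*)<\/TD>/i, 1]    # nil: "." does not match the newline
p SPLIT_CELL[/<TD>(.*)<\/TD>/im, 1]   # "Du\nPont": with "m", "." matches newlines too
p TWO_CELLS[/<TD>(.*)<\/TD>/im, 1]    # greedy ".*" runs on to the last </td>
p TWO_CELLS[/<TD>(.*?)<\/TD>/im, 1]   # non-greedy ".*?" stops at the first </td>
```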

unknown · October 13, 2006, 9:00pm

          re = /<TD>(.*)<\/TD>/

Use an HTML parser? Hpricot has been considered sexy recently.

David V.
