I have the following code snippet:
require 'net/http'
begin
hdoc =
Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
re = /<TD>(.*)</TD>/
if hdoc =~ re
print "#{$&}\n"
else
print "Nothing\n"
end
end
The regular expression is never matched when I use the code as shown
above (the expression for re is just a simple one for my testing).
However, if I replace the variable name hdoc by a string like
“TestTest1”, the regular expression is matched. The type of
hdoc is String. What is wrong with the snippet above? I even tried to
replace hdoc by hdoc.to_s and it still doesn’t work.
Thanks for your help!
Charles
Hi,
On Sat, Oct 14, 2006 at 02:55:10AM +0900, [email protected]
wrote:
It looks like there are no upper case “TD” tags in the page that you are
fetching. Try this instead:
begin
hdoc =
Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
re = /<TD>(.*)<\/TD>/i
if hdoc =~ re
print "#{$&}\n"
else
print "Nothing\n"
end
end
Your regular expression was case sensitive. I made it case
insensitive by adding the “i” switch.
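To see the effect of the “i” switch without hitting the network, here is a minimal sketch. The sample markup and the helper name `first_match` are made up for illustration; they are not taken from the Yahoo page.

```ruby
# Hypothetical sample markup standing in for the fetched page.
SAMPLE_HTML = '<table><tr><td>DuPont</td></tr></table>'

# Returns the full matched text, or nil when the pattern does not match.
def first_match(text, pattern)
  match = pattern.match(text)
  match && match[0]
end

case_sensitive   = /<TD>(.*)<\/TD>/
case_insensitive = /<TD>(.*)<\/TD>/i

p first_match(SAMPLE_HTML, case_sensitive)    # prints nil
p first_match(SAMPLE_HTML, case_insensitive)  # prints "<td>DuPont</td>"
```

The only difference between the two patterns is the trailing `i`, so the lower-case tags in the sample are matched only by the second one.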
On Oct 13, 2006, at 1:55 PM, [email protected] wrote:
When I substitute ‘\/TD’ for ‘/TD’ and make the regex case
insensitive, I get a match. See below:
#! /usr/bin/env ruby -w
require 'net/http'
hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
re = /<TD>(.*)<\/TD>/i ### note changes
if hdoc =~ re
puts "#{$&}\n"
else
puts "Nothing\n"
end
2 results for 'Dupont' (type=Stocks,
market=U.S. & Canada)
Regards, Morton
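As a side note on the escaping: inside a `/.../` literal an unescaped `/` ends the pattern, which is why `<\/TD>` is needed. Ruby's `%r{...}` literal is an alternative that avoids the backslash. A small sketch follows, where the sample string and the helper name `cell_text` are assumptions for illustration only:

```ruby
# Hypothetical one-cell sample, not the real page.
CELL_HTML = '<td>2 results</td>'

# Returns the text captured between the tags, or nil if there is no match.
def cell_text(html, pattern)
  match = pattern.match(html)
  match && match[1]
end

escaped   = /<TD>(.*)<\/TD>/i    # "/" must be escaped inside /.../
unescaped = %r{<TD>(.*)</TD>}i   # braces are the delimiters, so "/" is literal

puts cell_text(CELL_HTML, escaped)    # prints 2 results
puts cell_text(CELL_HTML, unescaped)  # prints 2 results
```

Both forms build the same pattern; `%r{}` is just easier to read when the pattern contains slashes.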
Morton G. wrote:
Morton, Aaron,
You are both right, thanks a lot! I also added “m” at the end of the
regular expression to match whatever might span two lines.
Cheers!
Charles
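For anyone reading later, here is a rough sketch of what the “m” switch changes. In Ruby, `.` does not match a newline unless `m` is given. The two-line sample string and the helper name `spans_match?` are invented for this example.

```ruby
# Hypothetical cell whose content is split across two lines.
MULTILINE_HTML = "<td>2 results for 'Dupont'\n(type=Stocks)</td>"

# True when the pattern matches anywhere in the text.
def spans_match?(text, pattern)
  !pattern.match(text).nil?
end

without_m = /<TD>(.*)<\/TD>/i    # "." stops at the newline
with_m    = /<TD>(.*)<\/TD>/im   # "." also matches the newline

p spans_match?(MULTILINE_HTML, without_m)  # prints false
p spans_match?(MULTILINE_HTML, with_m)     # prints true
```

Keep in mind that `.*` is greedy, so with `m` on a whole page it can run from the first opening tag to the last closing tag.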
re = /<TD>(.*)</TD>/
Use an HTML parser? Hpricot considered sexy recently.
David V.