Hi
I’ve been reading about Ruby and have started learning it to replace the
work I was doing with Perl. I have a question about the code from this link.
http://www.adaruby.com/2008/01/11/scraping-gmail-with-mechanize-and-hpricot/
I don’t understand this code. I can see that it is a block that is run
for each entry, where an entry contains the HTML for every tr with a
white background.
#################################
page.search("//tr[@bgcolor='#ffffff']") do |row|
  from, subject = *row.search("//b/text()")
  url = page.uri.to_s.sub(/ui.*$/,
                          row.search("//a").first.attributes["href"])
  puts "From: #{from}\nSubject: #{subject}\nLink: #{url}\n\n"
  email = agent.get url
##################################
But what does the line from, subject = *row.search("//b/text()") do?
How is *row different from row?
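To narrow the first question down, here is a minimal standalone sketch of the two forms of assignment, with and without the leading *. The helper names with_splat and without_splat are made up for illustration, and the sample values stand in for whatever row.search returns; this is only an experiment with plain Ruby objects, not a statement about how Hpricot behaves.

```ruby
# Hypothetical helpers, for illustration only: they isolate the
# "a, b = *values" form from the "a, b = values" form.
def with_splat(values)
  first, second = *values   # * expands values into a list via to_a
  [first, second]
end

def without_splat(values)
  first, second = values    # only array-like objects are unpacked here
  [first, second]
end

# With a real Array both forms unpack the same way; extra items are dropped.
p with_splat(["Alice", "Hello", "extra"])
p without_splat(["Alice", "Hello", "extra"])

# With a Range (not an Array) only the splat form unpacks the elements.
p with_splat(1..3)
p without_splat(1..3)
```

If the search result is already an Array or a subclass of one, the sketch suggests the * may make no visible difference, which is part of what I would like confirmed.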
Finally, what does this do? I can see a regex but don’t understand the
line.
url = page.uri.to_s.sub(/ui.*$/,
                        row.search("//a").first.attributes["href"])
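A small experiment with the same String#sub call on made-up values may help frame this second question. The helper name message_url, the example.com address, and the href value are all hypothetical; the real page URI and link attributes from Gmail may look different.

```ruby
# Hypothetical helper, for illustration only: replaces everything from
# the first "ui" to the end of the string with the given href.
def message_url(page_uri, href)
  page_uri.sub(/ui.*$/, href)
end

base = "http://mail.example.com/mail/?ui=html&zy=n"  # made-up page URI
href = "v=c&th=abc123"                               # made-up link target

puts message_url(base, href)
# prints http://mail.example.com/mail/?v=c&th=abc123
```

In this sketch the part of the URI before "ui" is kept and the rest is swapped for the link's href, so I assume the original line builds the address of an individual message in the same way.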
Thanks in advance for your help.
Regards Richard
############################ FULL CODE ############################
require 'rubygems'
require 'mechanize'
agent = WWW::Mechanize.new
page = agent.get 'http://www.gmail.com'
form = page.forms.first
form.Email = 'your gmail account'
form.Passwd = 'your password'
page = agent.submit form
page = agent.get page.search("//meta").first.attributes['href'].gsub(/'/, '')
page = agent.get page.uri.to_s.sub(/\?.*$/, "?ui=html&zy=n")
page.search("//tr[@bgcolor='#ffffff']") do |row|
  from, subject = *row.search("//b/text()")
  url = page.uri.to_s.sub(/ui.*$/,
                          row.search("//a").first.attributes["href"])
  puts "From: #{from}\nSubject: #{subject}\nLink: #{url}\n\n"
  email = agent.get url
…
end