The Mechanize documentation says to just start scraping with Nokogiri
once you’ve navigated to the right page with Mechanize, but this example
doesn’t seem to work for me:
agent = Mechanize.new
page = agent.get('http://google.com/')
google_form = page.form('f')
google_form.q = 'ruby mechanize'
page = agent.submit(google_form)
page.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end
But this works:
page = Nokogiri::HTML(open('ruby mechanize - Google Search'))
page.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end
Any advice?
Is there a form with the name ‘f’ on the page www.google.com? According
to the mechanize instructions, you were supposed to do this:
pp page
to identify the name of the form and the name of the form field you
want to fill in.
The Mechanize part works fine. I navigate to the page I want and when I
pretty print I get the page I want. The problem is I don’t know how to
scrape data from the page with Nokogiri once I’ve navigated to it. For
example, this is how you would do it with pure Nokogiri:
page = Nokogiri::HTML(open('ruby mechanize - Google Search'))
page.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end
But when I do it with Mechanize, it doesn’t output anything.
agent = Mechanize.new
page = agent.get('http://google.com/')
google_form = page.form('f')
google_form.q = 'ruby mechanize'
page = agent.submit(google_form)
page.xpath('//h3/a[@class="l"]').each do |link|
  puts link.content
end
It’s a trivial example, but for the data I’m scraping I need to use
Mechanize to navigate to the page, so I can’t just use Nokogiri. The
thing is, the Mechanize documentation says I should be able to use
regular Nokogiri methods, but they don’t seem to work for me.