Mechanize select list help...?


#1

Hi.

I’m using the excellent WWW::Mechanize to screen scrape a site for UK
frost dates (don’t ask).

There are a lot of issues with the HTML not being grand, so I thought
that might be where I’m going wrong in my code, but I’d be really
grateful if somebody could give me a steer on this, as I’ve been trying
for hours and the documentation only gets me halfway.

All I want to do is select each of the 100 or so towns in the select
list, follow the link via the submit button and scrape the first and
last frost dates from the resulting page.

Here’s the code:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://www.gardenaction.co.uk/main/weather1.asp')

town_results = page.form_with(:action => 'create_cookie.asp') do |e|
  e.fields.name('Town').options.each do |s|
    s.select
  end
end.submit

p town_results.search('/<p align="left">HOME TOWN:(.*)<Form Method=Post Action="create_cookie.asp">/')

I think I’m actually getting the original page back as a result, not the
results page (which should be
http://www.gardenaction.co.uk/main/weather1-results.asp)

Can anyone give me some advice here? It should be obvious I’m new to
Ruby and OO, so I’m fully expecting to have gone wrong here with instance
variables or the like.

thanks in advance.

andy


#2

On Oct 6, 5:02 pm, Andy P. wrote:

Here’s the code. All I want to do is select each of the 100 or so towns

results page (which should be http://www.gardenaction.co.uk/main/weather1-results.asp)

Can anyone give me some advice here? It should be obvious I’m new to
Ruby and OO so am fully expecting to have gone wrong here with instance
variables or the like.

I don’t think it’s the Ruby; you need to think it through a bit more.
How many times will you need to submit the form? Once per town,
correct? Therefore, the submit and parse should be inside the loop.

Try this for starters:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
page = agent.get('http://www.gardenaction.co.uk/main/weather1.asp')

form = page.form_with(:action => 'create_cookie.asp')
form.fields.name('Town').options.each do |town|
  form['Town'] = town.value  # set the select list to this town's value
  page2 = form.submit        # one submit per town
  puts page2.body
  exit # remove when you're ready to process them all
end
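
Once the exit is removed, you will probably want to keep each result page
rather than just print it. Here is a rough sketch of one way to do that.
The helper name submit_for_each_option is made up for illustration, and it
only assumes the form responds to []= and submit, and that each option
responds to value and text, as Mechanize forms and options do.

```ruby
# Submit a form once per option of a select list and collect the responses.
# `form` is assumed to behave like a Mechanize form: it responds to
# []= (set a field by name) and submit (returns the resulting page).
# `options` is any list of objects responding to `value` and `text`.
# The helper name is hypothetical, not part of Mechanize.
def submit_for_each_option(form, field_name, options)
  results = {}
  options.each do |option|
    form[field_name] = option.value     # choose this town
    results[option.text] = form.submit  # one request per town
  end
  results
end

# Possible use with the form from the loop above (not run here):
#   form  = page.form_with(:action => 'create_cookie.asp')
#   towns = form.fields.name('Town').options
#   pages = submit_for_each_option(form, 'Town', towns)
#   pages.each { |town, result| puts "#{town}: #{result.body.size} bytes" }
```

Keying the results by the option text keeps the town name next to its page,
which should make the parsing step easier later on.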


#3

Thanks for the help, Mark. You’re right, I needed to think it through a
bit more. Plus, I was unnecessarily using the select method.

Now I’ve got to find a way to grab the stuff I need from the resulting
pages… on to the docs again.
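
One possible starting point, as a rough sketch only: the page's search
method expects a CSS or XPath expression rather than a regular expression,
so a regex is better run against the raw HTML in page.body. The helper name
text_after_label, the sample markup and the LAST FROST: label below are all
invented for illustration; the real results page will look different.

```ruby
# Pull the text that follows a label out of raw HTML with a regular
# expression. Returns nil when the label is not found.
# The helper name is hypothetical.
def text_after_label(html, label)
  match = html.match(/#{Regexp.escape(label)}\s*([^<]+)/i)
  match && match[1].strip
end

# Invented sample markup; the real results page will differ.
sample = '<p align="left">HOME TOWN: Exampletown</p><p>LAST FROST: 1 May</p>'

puts text_after_label(sample, 'HOME TOWN:')
puts text_after_label(sample, 'LAST FROST:')

# With a real result page this would be something like:
#   text_after_label(page2.body, 'HOME TOWN:')
```

The capture stops at the next tag, so it only picks up plain text that sits
directly after the label.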

cheers, andy