Find a button using mechanize

Hi all,

I wonder how to find a button in a page with Mechanize. The page dump
shows the button as:
{buttons #<WWW::Mechanize::Form::Button:0x3a69908 @name=nil, @value="Go">}

Thanks,

Li

Well, you should iterate through the fields of each form:

my_form = nil

catch(:FoundForm) do
  agent.page.forms.each do |form|
    form.fields.each do |field|
      if field.value == "Go"
        my_form = form
        throw FoundForm
      end
    end
  end
end

Now you would have the form containing the button, and could
operate on it.

I'm sorry, instead of:
throw FoundForm
you should have:
throw :FoundForm

So sorry about that.
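
For anyone following along: `catch` and `throw` match on the tag object. A bare `FoundForm` is an undefined constant, so `throw FoundForm` raises a NameError, while `throw :FoundForm` unwinds to the matching `catch(:FoundForm)`. Here is a minimal sketch without Mechanize. The nested arrays are only an assumed stand-in for forms and their field values:

```ruby
# Nested arrays stand in for "forms" holding "field values" (hypothetical data).
forms = [["Search", "Cancel"], ["abacus", "Go"], ["Go"]]

found = nil

catch(:found_form) do
  forms.each do |fields|
    fields.each do |value|
      if value == "Go"
        found = fields
        throw :found_form   # a Symbol tag; leaves both loops at once
      end
    end
  end
end

p found   # prints ["abacus", "Go"]
```

The search stops at the first match, so the third "form" is never examined.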

Hi Lex,
I followed your script but it prints nothing. I guess the inner loop
is not right. Going by this line:
{buttons
#<WWW::Mechanize::Form::Button:0x3a6702c @name=nil, @value="Go">}

I changed the inner loop to iterate over the buttons instead of the
fields. Now I get all the buttons and I am able to set @name or @value.
But I still have a problem: I can't print the whole line out as
{buttons
#<WWW::Mechanize::Form::Button:0x3a6702c @name=nil, @value="Go">}

I can only print part of it:
#<WWW::Mechanize::Form::Button:0x3a6702c>

BTW: where can I find some info about the relationship among
forms, fields, and buttons, and other background?

Thanks,

Li

######################
page.forms.each do |form|
  form.buttons.each do |button|
    if button.value == 'Go'
      puts button.name
      puts button.value
      button.name = '1'
      puts button.name
      puts button
    end
  end
end

###output

nil
Go
1
#<WWW::Mechanize::Form::Button:0x3ab4868>
nil
Go
1
#<WWW::Mechanize::Form::Button:0x3a9ac24>
nil
Go
1
#<WWW::Mechanize::Form::Button:0x3a807ac>
nil
Go
1
#<WWW::Mechanize::Form::Button:0x3a6702c>

Exit code: 0
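
On the printing question: `puts` calls `to_s`, which by default shows only the class name and object id, while `p` (or `inspect`) also lists the instance variables. A small sketch with a stand-in class follows. `FakeButton` is hypothetical and not part of Mechanize, and whether Mechanize's own button class customizes either method is not confirmed here:

```ruby
# FakeButton is a hypothetical stand-in for a Mechanize button object.
class FakeButton
  attr_accessor :name, :value

  def initialize(name, value)
    @name = name
    @value = value
  end
end

button = FakeButton.new(nil, "Go")

short = button.to_s      # what puts prints: class name and object id only
full  = button.inspect   # what p prints: also lists the instance variables

puts short
puts full
```

So replacing `puts button` with `p button` in the loop above would likely show the @name and @value parts as well.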

Could you please post the link to the site so that I can see where the
script is going wrong?


You must have missed my update on that method, if you're using the code
you pasted above. It should throw :FoundForm, instead of FoundForm.
Like this:

my_form = nil

catch(:FoundForm) do
  agent.page.forms.each do |form|
    form.fields.each do |field|
      if field.value == "Go"
        my_form = form
        throw :FoundForm
      end
    end
  end
end

Lex,

No, I didn't miss it. I tried both of your scripts but they don't work.
For me exception handling is not a priority; a Ruby script still works
very well without it (unlike Java...).

Here is the webpage:
http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary

I am trying to:
1) type a word such as "abacus"
2) click the "Go" button (there are four "Go" buttons but I am
only interested in the first one)
3) retrieve the definition (I might use Hpricot to do that)

Thanks for the follow-up,

Li

Li, I wouldn't try to find a form by searching for its button.
Rather, search for the name of the input field associated with
the button. Here is how I automated
http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary :

require "rubygems"
require "mechanize"

mech = WWW::Mechanize.new
mech.user_agent_alias = "Windows IE 6"
mech.get("http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary")

# pick the first form that contains the search box
form = mech.page.forms.select { |f| f.has_field?("aj_text1") }.first

if form.nil?
  abort "could not find form"
end

# fill in the word and submit the form
form.aj_text1 = "abacus"
new_page = mech.submit(form)
puts new_page.body
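
As a side note, the `select { ... }.first` idiom can also be written with `Enumerable#find`, which stops at the first match and returns nil when nothing matches, so the nil check still applies. Here is a sketch with a hypothetical `FakeForm` struct standing in for Mechanize forms. Only the `has_field?` call mirrors the script above:

```ruby
# FakeForm is a hypothetical stand-in; it only imitates has_field?.
FakeForm = Struct.new(:field_names) do
  def has_field?(name)
    field_names.include?(name)
  end
end

forms = [
  FakeForm.new(["q"]),
  FakeForm.new(["aj_text1", "submit"]),
  FakeForm.new(["aj_text1"])
]

form    = forms.find { |f| f.has_field?("aj_text1") }       # first match wins
missing = forms.find { |f| f.has_field?("no_such_field") }  # no match gives nil

abort "could not find form" if form.nil?

p form.field_names   # prints ["aj_text1", "submit"]
p missing            # prints nil
```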

Hi Lex,

Thank you very much.
Now the script is working. But I want to use Hpricot to extract some
info from the retrieved page. How can I do that?

Li

Lex W. wrote:

Li, please post examples. What is it you want to extract? From what
page? Why Hpricot (I assume you want it for XPath, but I can't be
sure)?

I need to:
1) extract the definition of 'abacus' (see the following) from
this page:

abacus (n.) A manual computing device consisting of a frame holding
parallel rods strung with movable counters.
abacus (n.) A slab on the top of the capital of a column.

2) download the 'wav' file for this word and save it to my computer (so
that I can play it later).

Since I have a little experience with Hpricot, I think using it
might help me extract the definition of 'abacus'. But I am not sure if
Mechanize can do the same thing. This is the reason I want to try
Hpricot.

Thanks,

Li

Li, here is the script that downloads the wav file. It's kind of late
here, and I'm not really in the mood to extract definitions right
now. Maybe tomorrow. Here's the code:

require "rubygems"
require "mechanize"

mech = WWW::Mechanize.new
mech.user_agent_alias = "Windows IE 6"
mech.get("http://www.ask.com/web?qsrc=2352&o=0&l=dir&dm=&q=dictionary")

# pick the first form that contains the search box
form = mech.page.forms.select { |f| f.has_field?("aj_text1") }.first

if form.nil?
  abort "could not find form"
end

form.aj_text1 = "abacus"
new_page = mech.submit(form)

# first link on the result page whose href ends in .wav
wav_link = mech.page.links.select { |link| link.href =~ /\.wav$/i }.first
puts "downloading #{wav_link.href}"
mech.get(wav_link).save_as(File.basename(wav_link.href))

Hi Lex,

Thank you very much for the code.

Li

On Wed, Sep 10, 2008 at 9:58 AM, Li Chen wrote:

Hi Lex,

I ran your script and downloaded a wav file of 'abacus'. I find that I
cannot play the wav file with WMP. But I can play it if it is
downloaded directly from the IE browser. This is bizarre. I wonder how to
fix it.

Compare the two files and see if they are the same. The site might
have some sort of referrer-based protection against automated
downloads.

martin

The file downloaded directly is about 9 kb and the one downloaded by the
script is 44 kb.

Li

On Wed, Sep 10, 2008 at 10:14 AM, Li Chen wrote:

The file downloaded directly is about 9 kb and the one using script is
44 kb.

It’s probably not the wav file, then. Does opening it up in notepad
reveal anything?

martin
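
A quick way to check this without opening an editor is to look at the first bytes: a real WAV file starts with the ASCII tag `RIFF` and carries `WAVE` at offset 8. Below is a minimal sketch. The helper name `wav_file?` and the sample file contents are made up for the demonstration:

```ruby
require "tmpdir"

# Returns true when the file starts with the RIFF/WAVE header of a WAV file.
def wav_file?(path)
  header = File.open(path, "rb") { |f| f.read(12) }
  !header.nil? && header.bytesize == 12 &&
    header[0, 4] == "RIFF".b && header[8, 4] == "WAVE".b
end

Dir.mktmpdir do |dir|
  real = File.join(dir, "real.wav")   # hypothetical sample files
  fake = File.join(dir, "fake.wav")

  # 12-byte RIFF header followed by the start of a "fmt " chunk
  File.open(real, "wb") { |f| f.write("RIFF\x24\x00\x00\x00WAVEfmt ".b) }
  # an HTML page saved under a .wav name, like the failed download
  File.open(fake, "wb") { |f| f.write("<html><body>not audio</body></html>") }

  puts "real.wav: #{wav_file?(real)}"
  puts "fake.wav: #{wav_file?(fake)}"
end
```

A file that fails this check but has a .wav name is most likely an error or redirect page saved under the wrong name.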

The one downloaded directly is a binary file and the one downloaded by
the script is an HTML page.

Since I used Lex's script (posted above) to download the file, I wonder
how to fix it.

Thanks,

Li

On Wed, Sep 10, 2008 at 10:32 AM, Li Chen wrote:

puts "downloading #{wav_link.href}"
mech.get(wav_link).save_as(File.basename(wav_link.href))

wget-ting the displayed href works, so I'm guessing it's the call to
mech.get that's failing. There's probably a cleaner way to do it via
Mechanize, but this works:

File.open(File.basename(wav_link.href), 'w') { |f|
  f.puts(mech.get_file(wav_link.href)) }

martin

Hi Martin,

Now the wav file can be played properly.

I have another question: the file downloaded via the script is 9.36 kb
and the one from the browser is 9.35 kb. What causes the discrepancy?

Another question: I also need to retrieve the definition of the word
corresponding to the wav file from the same page. Can Mechanize do that?
I plan to use Hpricot to extract the info. I wonder if you or others
have any suggestions.

Thank you very much,

Li
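
On the size discrepancy: a likely cause, though nobody in the thread confirmed it, is the way the file is written. `f.puts` appends a newline when the data does not already end in one, and mode 'w' on Windows opens the file in text mode, which can alter line-ending bytes. Opening with 'wb' and calling `write` leaves the bytes untouched. Here is a small sketch with made-up bytes instead of a real download:

```ruby
require "tmpdir"

# Made-up binary payload: 8 bytes, no trailing newline.
data = "RIFF\x00\x01\x02\x03".b

sizes = Dir.mktmpdir do |dir|
  with_puts  = File.join(dir, "puts.bin")
  with_write = File.join(dir, "write.bin")

  File.open(with_puts, "w")   { |f| f.puts(data) }    # text mode, puts adds a newline
  File.open(with_write, "wb") { |f| f.write(data) }   # binary mode, bytes as-is

  [File.size(with_puts), File.size(with_write)]
end

puts "payload: #{data.bytesize} bytes"
puts "puts:    #{sizes[0]} bytes"
puts "write:   #{sizes[1]} bytes"
```

Applied to the download, that would mean opening the output file with 'wb' and using `f.write` on the data returned by `mech.get_file`.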