Mechanize Timeout Exception - How To?

Hi all,

     I am very new to scraping and using Mechanize. It was all a smooth
run until I hit this problem of handling a timeout while fetching a web
page.

Timeout::timeout() is unable to rescue this kind of error. Here is
my code:

require 'timeout'
require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
begin
  Timeout::timeout(10) do
    agent.get('http://www.r-knowsys.com') # this url doesn't exist
  end
rescue Timeout::Error
  puts "timeout the page doesnt exist"
end
When I run it, the error stack is as follows:

/usr/lib/ruby/1.8/net/http.rb:560:in `initialize': getaddrinfo: Name or service not known (SocketError)
	from /usr/lib/ruby/1.8/net/http.rb:560:in `open'
	from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
	from /usr/lib/ruby/1.8/timeout.rb:48:in `timeout'
	from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
	from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
	from /usr/lib/ruby/1.8/net/http.rb:553:in `do_start'
	from /usr/lib/ruby/1.8/net/http.rb:542:in `start'
	from /usr/lib/ruby/gems/1.8/gems/mechanize-0.6.4/lib/mechanize.rb:352:in `fetch_page'
	from /usr/lib/ruby/gems/1.8/gems/mechanize-0.6.4/lib/mechanize.rb:143:in `get'
	from test.rb:8
	from /usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
	from test.rb:7

Exit code: 1

How do I handle such a case? Any help appreciated.

regards,
venkat

Hi,

Indeed, the problem here isn't that the request is timing out - it's
failing within 10 seconds. You need to catch a SocketError - it can't
find the site www.r-knowsys.com, so it fails with a SocketError instead.

Cheers,
Arlen.
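
A minimal sketch of that fix, using Net::HTTP from the standard library
(which Mechanize calls under the hood) so it stands alone; the same
rescue clause can be added to the original agent.get snippet. The host
no-such-host.invalid is a placeholder name that is guaranteed never to
resolve:

```ruby
require 'net/http'
require 'uri'
require 'timeout'

# Try to fetch a page, reporting what went wrong instead of crashing.
def fetch_status(url)
  uri = URI.parse(url)
  Net::HTTP.get_response(uri)
  :fetched
rescue SocketError
  # DNS lookup failed - the host name could not be resolved.
  :host_not_found
rescue Timeout::Error, Errno::ETIMEDOUT
  # The host resolved, but the request took too long.
  :timed_out
end

puts fetch_status('http://no-such-host.invalid/')
```

An unresolvable host lands in the SocketError branch, which is exactly
the case the Timeout::Error rescue in the original snippet could not
catch.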

Hi Arlen,

Thanks a lot... that worked. Seems I need to look into the differences
between SocketError and Timeout::Error...

Cheers…
Venkat