Mechanize click()-Problem

kien · February 12, 2008, 12:54am

Hello,

I am trying to log on to a website using Mechanize. I can submit my username
and password, but after I get redirected I run into a problem.
I want to click on a link using the following code:

link = page.links.href(/index.php?ahcd_obj=([a-z]|[0-9])*/)
page = agent.click(link)

(The RegExp is fine, I tested it manually)

It looks as if there is some hidden redirect or something like that,
because the result is always the same, that is I’m still on the same
page.

That’s one of those links:

Help is welcome.

kien · February 12, 2008, 8:36am

Toni Keller wrote:

It looks as if there is some hidden redirect or something like that,
because the result is always the same, that is I’m still on the same
page.

The server can accomplish a redirect by sending back a page that is the
same as the original page, but that has an HTML <meta> tag. The
<meta> tag can direct the browser to go to a specified URL after a given number
of seconds, which can be 0 seconds.

If that is the case, then you need to get the URL specified in the
<meta> tag. You can examine the HTML of the page to see if it has a
<meta> tag by looking at the output from:

page = agent.get('http://google.com/')
puts page.body  # which appears to be undocumented

Try that and post the first 20 lines or so of the output.
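
A minimal sketch of that check, assuming the redirect uses the usual <meta http-equiv="refresh" content="0; url=..."> form. The helper name meta_refresh_url and the sample HTML are made up for illustration. It works on a plain string such as page.body and relies only on a regular expression, so pages that order or quote the attributes differently would need a real HTML parser instead.

```ruby
# Hypothetical helper: returns the target URL of a meta refresh tag, or nil.
# Assumes http-equiv comes before content and the content value is quoted.
def meta_refresh_url(html)
  pattern = /<meta[^>]*http-equiv=["']?refresh["']?[^>]*content=["']\s*\d+\s*;\s*url=([^"'>\s]+)/i
  match = html.match(pattern)
  match && match[1]
end

# Made-up sample page, standing in for page.body
sample = '<html><head><meta http-equiv="refresh" content="0; url=http://example.com/next"></head></html>'

p meta_refresh_url(sample)
p meta_refresh_url('<html><body>no redirect here</body></html>')
```

If the helper returns a URL, that URL could then be passed to agent.get to follow the redirect by hand.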

kien · February 12, 2008, 1:28pm

7stud – wrote:

Try that and post the first 20 lines or so of the output.

"\n\nxxxx - Human Control Detector\n<style
type="text/css">\n\n\n\n<body
style="background-color: #000000;">\n\n<table border=0 width="100%"
height="100%">\n

\n<td valign="center"
align="center">\n\n<table align=center cellpadding=0 cellspacing=0
height=300px width=560px bgcolor="#999999">\n <img
src="images/rahmen_aussen_ecke_oben_links.jpg"><td
background="images/rahmen_aussen_oben.jpg"><img
src="images/rahmen_aussen_ecke_oben_rechts.jpg"> \n<td
background="images/rahmen_aussen_links.jpg"
width=30px>\n<table class="rahmen" align=center width=500px
height=260px cellpadding=0 cellspacing=0>\n \n<td
class="unten_rahmen" align=center valign=center>\n\n\n<table
cellpadding=0 cellspacing=0 border=0
style="background-image:url(codepic2.php?t=1202818363);"
width="312px" >\n\n<td width="33%">\n<a
href="index.php?ahcd_obj=d88eabe93b20f6147cbbf5f3159ce936"><img
src="images/dummy.gif" width="100px" height="128px"
border="0">\n\n<td width="33%">\n<a
href="index.php?ahcd_obj=306a58246e04d1fe57c4033b4eef1a12"><img
src="images/dummy.gif" width="100px" height="128px"
border="0">\n\n<td width="33%">\n<a
href="index.php?ahcd_obj=e3aa303169b745f2aec63f3dc9184069">

I cut off the end because it contains no meta tags.
The site has some kind of script protection: you have to click the right
one of three links. There is no way to figure out which link is the right
one, so I log in, click a link, check the result, and if I'm still on the
same page I log out and try again. So far I haven't been able to get past this
protection in Ruby, although this 'algorithm' works very well in the
browser.
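
One thing that may be worth re-checking is the pattern itself. In a Ruby regular expression an unescaped ? makes the preceding character optional and an unescaped . matches any character, so the pattern as posted does not require a literal question mark after index.php. The sketch below tests both variants against one of the hrefs from the output above, with no Mechanize involved. The variable names are only for illustration, and this may or may not be the cause of the problem.

```ruby
# One of the hrefs from the page output posted above
href = "index.php?ahcd_obj=d88eabe93b20f6147cbbf5f3159ce936"

# Pattern as posted: the "?" makes the preceding "p" optional,
# so the literal "?" in the href is never matched
unescaped = /index.php?ahcd_obj=([a-z]|[0-9])*/

# Same pattern with "." and "?" escaped so they match literally
escaped = /index\.php\?ahcd_obj=[a-z0-9]*/

p(href =~ unescaped)  # => nil
p(href =~ escaped)    # => 0
```

If the escaped pattern is the one that matches, it can be used in place of the original when selecting the link.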