I’m trying to scrape a page that hides some data behind a javascript
function. Is there any way to get this data? I’ve been using
Mechanize, but I’m not sure it can do this. Is there a better library
to use for this type of thing?
The following is the interesting part of the page:
The other trick here is that this page is behind a login. Mechanize
allows me to fill out the login form and holds onto the login
credentials for me. Can Harmony/Celerity/Watir do this?
The really interesting part is what the JavaScript does. With
(potentially large) effort you may be able to “reverse-engineer” the
JavaScript and emulate it manually in Mechanize. I.e., if the JavaScript
builds a simple HTTP request, you may be able to send the same request
from Mechanize, possibly without much effort.
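As a rough sketch of what "sending the same request" could look like, the snippet below builds a form POST by hand using only Ruby's standard library. The URL, the form parameter, and the cookie value are all made-up placeholders, not taken from the actual page; the real values would have to be read from whatever request the script issues. With Mechanize itself, `agent.post(url, params)` on the already logged-in agent should reuse the session cookies for you.

```ruby
require "net/http"
require "uri"

# Build (but do not send) a POST that imitates a request made by page JavaScript.
# All arguments are hypothetical placeholders for illustration only.
def build_emulated_request(url, params, cookie)
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.request_uri)
  request.set_form_data(params)   # URL-encodes the params into the body
  request["Cookie"] = cookie      # session cookie obtained after logging in
  [uri, request]
end

uri, request = build_emulated_request(
  "https://example.com/statement/data",  # assumed endpoint
  { "page" => "MainPage" },              # assumed form field
  "session=abc123"                       # assumed cookie
)
puts request.body

# Sending it would look roughly like this (left commented out on purpose):
# Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
#   response = http.request(request)
# end
```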
How would one do this? I’m somewhat new to javascript as I usually
don’t do front end engineering. I see the below definition of this
function in the HTML page. Any way I can sniff out what it’s actually
doing? I’m looking to figure out what the fireClick method displays.
<script type="text/javascript">
var d = document.domain.split(".");
document.domain = d[d.length - 2] + "." + d[d.length - 1];
var start = (new Date()).getTime();
var fireClick = function(){};
var omn_hierarchy="US|AMEX|Ser|eStatement";
var omn_pagename="MainPage";
var omn_language="en";
var omn_newpagename="yes";
</script>
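Note that `var fireClick = function(){};` above is only an empty stub, so the real behaviour is presumably assigned somewhere else (another inline script or an external .js file). One way to hunt for it is to search every script body you can fetch for the name. A minimal sketch, assuming you already have the HTML as a string (for example from Mechanize's `page.body`); the helper name is hypothetical:

```ruby
# List every line inside <script>...</script> blocks that mentions `name`.
# Regex-based and deliberately simple; a real HTML parser would be more robust.
def script_lines_mentioning(html, name)
  html.scan(%r{<script\b[^>]*>(.*?)</script>}mi).flat_map do |(body)|
    body.lines.map(&:strip).select { |line| line.include?(name) }
  end
end

html = <<~HTML
  <script type="text/javascript">
  var start = (new Date()).getTime();
  var fireClick = function(){};
  </script>
HTML

puts script_lines_mentioning(html, "fireClick")
```

Running the same search over any external script files the page references may turn up the place where fireClick is reassigned.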
On Thu, May 20, 2010 at 1:48 AM, Phil Mcdonnell [email protected] wrote:
I’m trying to scrape a page that hides some data behind a javascript
function. Is there any way to get this data? I’ve been using
Mechanize, but I’m not sure it can do this. Is there a better library
to use for this type of thing?
The following is the interesting part of the page:
On Fri, May 21, 2010 at 1:14 AM, Phil Mcdonnell [email protected] wrote:
The other trick here is that this page is behind a login. Mechanize
allows me to fill out the login form and holds onto the login
credentials for me. Can Harmony/Celerity/Watir do this?
Watir definitely does that since it simply controls your browser and
therefore behaves exactly like one.
On Mon, May 24, 2010 at 3:36 AM, Phil Mcdonnell [email protected] wrote:
With Watir I’m running into a problem finding the image button for login
on the following page: Login
It looks like the login button is just a clickable image and I should be
able to find it via:
browser.button(:alt, "Login").click
Any idea why that doesn’t find the button?
Sorry, don’t have time to look at the page right now, but if it “is
just a clickable image” and not an actual “button” watir’s button
helper may not find it (even though it looks like a button) so try
browser.image().click?
Darryl! You just made my day! This does work. I’ve been banging my
head on the wall for a while here. I had tried looking for the src
attribute too, but not with the full path (only the relative path in
the HTML).
This was captured using the Webmetrics script recorder http://www.webmetrics.com/products/script_recorder.html
It has a Watir-compatible mode. You won’t get a working
script out of it, but it is good for identifying objects.
Inspect Element using Firebug:
A nice helper tool for identifying page objects such as this Webmetrics
Good luck,
Darryl