Accessing dynamic JavaScript with Ruby

Hey all

I’m experimenting with writing a scraper at the moment and have hit a
major hump.

Part of the DOM is added via JavaScript after the page has loaded.

This means that when I make a request, the HTML response I receive
doesn’t accurately represent the page.

Here’s a simplified example:

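require "net/http"

@http_obj = Net::HTTP.new("targetdomain.com")
response = @http_obj.request_get("/")  # returns only the response on Ruby 1.9+
page_data = response.body              # raw HTML, before any JavaScript has run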

page_data doesn’t contain all of the HTML that is actually shown in the browser.

Is there any library or gem that could simulate the browser
updating the DOM with the JavaScript, or any other way I could approach
this short of decoding the obfuscated JavaScript file?

Thanks in advance

Gav

Gavin M. wrote:

Is there any library or gem that could simulate the browser
updating the DOM with the JavaScript, or any other way I could approach
this short of decoding the obfuscated JavaScript file?

Try Selenium or some other remote browser control.
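
For what it's worth, a minimal sketch of the Selenium route might look like the following. It assumes the selenium-webdriver gem plus a local Chrome and matching driver are installed. The helper names rendered_html and fetch_with_chrome and the "#late-content" selector are made up for illustration, not anything from the target site. The idea is to let a real browser run the scripts, then read back the live DOM instead of the raw HTTP body.

```ruby
# Load +url+ in a browser and return the DOM as HTML after scripts have run.
# +driver+ is any object responding to #get and #page_source
# (a Selenium::WebDriver driver in practice).
def rendered_html(driver, url)
  driver.get(url)               # navigate and wait for the initial page load
  yield driver if block_given?  # optional hook: wait for late-injected nodes
  driver.page_source            # serialized current DOM, not the raw response
end

# Hypothetical convenience wrapper around headless Chrome.
# +selector+ is a CSS selector for an element that only exists once the
# page's JavaScript has finished adding its content.
def fetch_with_chrome(url, selector, timeout: 10)
  require "selenium-webdriver"  # assumption: gem install selenium-webdriver

  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument("--headless=new")  # assumption: a recent Chrome
  driver = Selenium::WebDriver.for(:chrome, options: options)

  rendered_html(driver, url) do |d|
    # Poll until the element appears; raises a timeout error otherwise.
    Selenium::WebDriver::Wait.new(timeout: timeout)
                             .until { d.find_element(css: selector) }
  end
ensure
  driver&.quit  # always close the browser, even on errors
end
```

Something like page_data = fetch_with_chrome("http://targetdomain.com/", "#late-content") would then stand in for the Net::HTTP call. Waiting for a specific element matters because the page load can finish before the script has inserted its markup.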

Best,

Marnen Laibow-Koser
http://www.marnen.org
