How to parse a rendered HTML page

I’m trying to write a website that extracts all images from a given
webpage. I initially tried to get all image links by looking for
<img> tags with the Nokogiri HTML parser, and it works well on webpages
without JavaScript.

Some pages use JavaScript to render the view, and with Nokogiri I’m
just getting the raw HTML from before it’s rendered.

How can I get the page after it has been rendered by JavaScript?

You can try scrubyt (http://scrubyt.org/index.html), which is like
Mechanize but supports JavaScript.

parkurm wrote:

How can I get a page after being rendered by javascript?

Webrat + Selenium would be one way.

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
