Scrape data from JavaScript inside HTML source


#1

Hi,
Is there any scraper that can interact with a JavaScript function inside an
HTML page? Sometimes the data is returned from a JavaScript function
or held in a JavaScript variable, so I wonder if there’s an easy way in Ruby
to get the data out by evaluating the JavaScript in the context of
the page.

Yaxm


#2

Yaxm Y. wrote:

Is there any scraper that can interact with a JavaScript function inside an
HTML page? Sometimes the data is returned from a JavaScript function
or held in a JavaScript variable, so I wonder if there’s an easy way in Ruby
to get the data out by evaluating the JavaScript in the context of
the page.

In test? Or “scraping” a target website to see what it’s got?

Either way, I would use Nokogiri to rip the HTML and find the <script> tags,
then use racc and rkelly to interpret the JavaScript and find its variables.

By “would” I mean I already do that. Here are the rkelly calls required:

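   require 'rkelly'

   # js is the JavaScript source text pulled out of the page
   RKelly.parse(js).pointcut('TargetMethod()').  # with the ()
       matches.each do |updater|
     # each match is a call to TargetMethod; dump its arguments
     updater.grep(RKelly::Nodes::ArgumentsNode).each do |thang|
       p thang
     end
   end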

However, if you are attacking other people’s websites to scrape out
their data, you might instead try Watir. It just runs a web browser
and evaluates its JS directly.