Screenscraping using htmltools and rexml


#1

Hi,
I need to do some screen scraping, and I’ve spent a couple of hours trying
to get htmltools and rexml to do the right thing. Here’s the code:

parser = HTMLTree::Parser.new(false, false)
parser.feed(res.body)
tree = parser.tree.html_node.as_rexml_document

It works for one page, but for another I get “undefined method `add’ for
#<HTMLTree::Element:0x37f9cc8>” in as_rexml_document.

It seems like a library mismatch, but I just downloaded ruby and all
the libraries in the past couple of days. Does anybody know which versions
I need to make this work?

Btw, the versions I have now are:
htmltools 1.09
rexml 3.1.2.1

I also tried rexml 3.1.3 and the “stable” version, rexml 2.4.8,
but neither of them works.

Thanks a lot!
Peter