Hi,
I have a few questions about parsing HTML:
- The default docs (rdoc) for HTMLParser (the one that comes with the
  Win32 binary distribution) in Ruby are very poor. Where can I find
  good documentation for the module, or better yet a tutorial or
  examples?
- Is HTMLParser modeled on Perl’s HTML::Parser?
- Which parser would you suggest for tokenizing an HTML document and
  building a tree from it? Hpricot looks like a nice parser and is
  well documented, but I’m not sure it’s suitable.
Thanks in advance.