Hi all.
Noob, first project, read the Poignant Guide, et al.
I have a big Perl script that parses badly-formed HTML files with
HTML::Element/HTML::Tree. I think it’s time for an update.
I think the equivalent in Ruby is Hpricot? I haven’t found a lot of docs
on it, so I am assuming that this type of problem is something that
becomes ‘obvious’ once you start working in Ruby. Or should I be
looking at another/better solution (as in, duh, it’s got XXX built in,
noob…)?
TIA
Michael Lesser wrote:
> I have a big Perl script that parses badly-formed HTML files [...]
> I think the equivalent in Ruby is Hpricot? [...] Or should I be
> looking at another/better solution [...]?
You might want to take a look at html5lib (on GitHub) for parsing bad
markup.