How to parse an HTML document in Ruby?

Hi,

I want to parse an HTML document using Ruby.
I tried REXML, but it failed to load the document because the HTML is
not well-formed.
Can you please suggest a good parser that I can use to parse HTML
pages in Ruby?

Thanks,
Karika.

Karika wrote:

Hi,

I want to parse an HTML document using Ruby.
I tried REXML, but it failed to load the document because the HTML is
not well-formed.
Can you please suggest a good parser that I can use to parse HTML
pages in Ruby?

Thanks,
Karika.

I’ve had good luck with Rubyful Soup:

http://rubyforge.org/projects/tidy/
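
For reference, the REXML failure described above is easy to reproduce. The following is only a minimal sketch: the helper name `well_formed?` and the two sample strings are made up for illustration. It shows that REXML, being a strict XML parser, rejects ordinary HTML such as an unclosed `<br>`, which is why an HTML-tolerant parser is needed.

```ruby
require 'rexml/document'

# Hypothetical helper: returns true if REXML can parse the markup,
# false if REXML rejects it as not well-formed.
def well_formed?(markup)
  REXML::Document.new(markup)
  true
rescue REXML::ParseException
  false
end

# Ordinary HTML: <br> is never closed, so this is not well-formed XML.
html = "<html><body><p>Hello<br>world</p></body></html>"

# The same content with a self-closing <br/>, which is well-formed.
xhtml = "<html><body><p>Hello<br/>world</p></body></html>"

puts well_formed?(html)   # prints false
puts well_formed?(xhtml)  # prints true
```

If the pages you need to read look like the first string, REXML alone will not be enough; you would need either a parser that tolerates such markup or a cleanup step that converts the page to well-formed XML first.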