HTML parsing with Hpricot

Hi,
I need to parse a wonderful HTML page full of tables everywhere!
Obviously I am using Hpricot to parse my HTML, and this is what I've done
so far.
But now I'm stuck :S

page.search("#profile > table > tr")[1].at("td").at("table")

In my table element I now need to fetch the child "tr" elements, so I've done this:

page.search("#profile > table > tr")[1].at("td").at("table").search("tr")

But this isn't working, because it fetches the direct children and also
the rows nested inside them.
How can I fetch just the direct child elements?

Greg

Greg Ma wrote:

This did the trick:
page.search("#profile_bandschedule > table > tr")[1].at("td").at("table").search("/tr")
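For anyone hitting the same problem, the difference between "every tr below the table" and "only the tr directly under the table" can be sketched without Hpricot. The snippet below is only an illustration, not the code from this thread: it uses REXML from Ruby's standard distribution as a stand-in, and the markup and ids are made up. The XPath step "./tr" plays the role that "/tr" appears to play in the Hpricot call above.

```ruby
require "rexml/document"

# Hypothetical, well-formed markup standing in for the real page:
# an outer table whose first row contains a nested table.
xml = <<~XML
  <table id="outer">
    <tr id="row1">
      <td>
        <table id="inner">
          <tr id="nested1"><td>a</td></tr>
          <tr id="nested2"><td>b</td></tr>
        </table>
      </td>
    </tr>
    <tr id="row2"><td>c</td></tr>
  </table>
XML

outer = REXML::Document.new(xml).root

# Descendant search: every <tr> at any depth, nested rows included.
all_rows = REXML::XPath.match(outer, ".//tr").map { |tr| tr.attributes["id"] }

# Child search: only the <tr> elements directly under the outer table.
child_rows = REXML::XPath.match(outer, "./tr").map { |tr| tr.attributes["id"] }

puts "descendants: #{all_rows.inspect}"
puts "children:    #{child_rows.inspect}"
```

The descendant query picks up the rows of the inner table as well, which is the behaviour described in the question; the child query stops at the first level.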

It’s not so obvious that you are using Hpricot. You could use Nokogiri
too.
I prefer Nokogiri (http://nokogiri.org/)

Juan José Vidal

On 09/06/10 15:35, Greg Ma wrote: