A Little Help with Hpricot Parsing

I need to scrape the name, address, city, state, zip, etc. from a
site. Due to their wonderful coding, they didn't put the lines of text in
tags. So Hpricot doesn't see the address (et al.) as an element, but it does see the
<br> tags as empty elements.

So how do I pick out the address and so forth if they are not in
elements?

Here is what I have scraped so far, down to my target table:

Bank of America

405 N. 3rd Street
Phoenix, Arizona 60606
Distance: N/A
800-555-1212
  • June 15 - 20 2008
    6:00 PM - 9:00 PM
    Event Theme Name

Washington Mutual

3705 Beaver Creek Rd
Austin, Texas 60606
Distance: N/A
800-555-1212
  • July 07 - 11 2008
    9:00 AM - 12:00 AM
    Event Theme Name

So how do I pick out the address and so forth if they are not in
elements?

Use that fact to your advantage. The only non-elements in the table
cell are your addresses.

doc.search('/table/tr/td/*').each do |node|
  print node unless node.elem?
end

This prints only the non-element (text) nodes, which in this table cell are the address lines.
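
Once you have those text lines, you still need to split them into street, city, state, zip and phone. Here is a rough sketch of one way to do that in plain Ruby. The helper name parse_location is made up for illustration, and it assumes the lines arrive in the same order as in your sample output (street, "City, State ZIP", a "Distance:" line, then the phone number), so adjust it if the real cells differ.

```ruby
# Hypothetical helper (not part of Hpricot): turns the text lines from one
# table cell into a hash of fields.
# Assumes the order seen in the sample output above:
#   street, "City, State ZIP", "Distance: ...", phone
def parse_location(lines)
  # Text nodes usually carry stray whitespace and newlines, so clean them up
  cleaned = lines.map { |line| line.to_s.strip }.reject { |line| line.empty? }
  street, city_line, _distance, phone = cleaned

  # "Phoenix, Arizona 60606" -> city, state, zip
  match = city_line.to_s.match(/\A(.+?),\s*(.+?)\s+(\d{5}(?:-\d{4})?)\z/)

  {
    street: street,
    city:   match && match[1],
    state:  match && match[2],
    zip:    match && match[3],
    phone:  phone
  }
end

# Lines as they might come out of the loop above (collected with node.to_s)
lines = ["\n405 N. 3rd Street", "\nPhoenix, Arizona 60606",
         "\nDistance: N/A", "\n800-555-1212\n"]
location = parse_location(lines)
puts location[:city]   # prints "Phoenix"
```

If the city line doesn't match the pattern, the city, state and zip come back as nil rather than raising, so you can spot the odd rows and handle them separately.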
