Scraping 3rd element with hpricot


#1

I don’t know why I’m struggling with this (not a Ruby expert) but using
the standard scraping example:

doc = Hpricot(@response)
puts (doc/"#{xpath}").first.inner_html
puts (doc/"#{xpath}").last.inner_html

What could I use instead of first or last to get the nth element?

Thanks!


#2

On Tue, Dec 9, 2008 at 2:09 PM, Mark N. wrote:

I don’t know why I’m struggling with this (not a Ruby expert) but using
the standard scraping example:

doc = Hpricot(@response)
puts (doc/"#{xpath}").first.inner_html
puts (doc/"#{xpath}").last.inner_html

What could I use instead of first or last to get the nth element?

Hpricot::Doc#search (the / operator is shorthand for it) returns an
instance of Hpricot::Elements, which is a subclass of Array, so you can
index it like any other array.

puts (doc/"#{xpath}")[n].inner_html # counting from 0, so for the 3rd element n = 2
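
To make the indexing concrete without needing the hpricot gem, here is a small sketch that uses a plain Array as a stand-in for the search result. The results name and its contents are made up for illustration; the only assumption carried over from above is that the real result behaves like an Array.

```ruby
# Stand-in for the search result. A plain Array of strings is assumed here
# in place of Hpricot::Elements, so the sketch runs without the gem.
results = ["alpha", "beta", "gamma", "delta"]

puts results.first == results[0]   # prints true: first is index 0
puts results.last == results[-1]   # prints true: last is index -1
puts results[2]                    # prints gamma: 3rd element, counting from 0
```

So first and last are just the two ends of the same numbering, and anything in between is reached with [n].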

However, if the search returns too few elements, [n] gives you nil and
calling inner_html on it raises a NoMethodError. So it is safer to do
the search, pull out the element by index, and only use it if it is
actually there. Like so:

# assign first: with a trailing "if x = ...", x is not yet a known local
# variable when the puts is parsed, and Ruby raises a NameError
x = (doc/"#{xpath}")[n]
puts x.inner_html if x
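
The out-of-range behaviour can be checked on an ordinary Array. This is only a sketch: results is a made-up stand-in for the Hpricot result, and upcase plays the role of inner_html.

```ruby
# Plain Array standing in for the search result (assumption, not Hpricot).
results = ["alpha", "beta", "gamma"]
n = 5

item = results[n]            # nil, because index 5 is out of range
puts item.upcase if item     # skipped, so no NoMethodError on nil

# Array#fetch with a default is another way to avoid the nil.
puts results.fetch(n, "(no element at that index)")
```

Array#fetch without a default raises an IndexError instead, which may be preferable when a missing element should be treated as a real failure.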

-Michael


#3

Michael L. wrote:

puts (doc/"#{xpath}")[n].inner_html # counting from 0, so for the 3rd element n = 2

Thanks Michael, sometimes it’s right in front of you. I was trying to
add a dot before [n]…

–Mark