Forum: Ruby Scraping 3rd element with hpricot

Mark N. (Guest)
on 2008-12-09 22:15

I don't know why I'm struggling with this (not a Ruby expert) but using
the standard scraping example:

  doc = Hpricot(@response)
  puts (doc/"#{xpath}").first.inner_html
  puts (doc/"#{xpath}").last.inner_html

What could I use instead of first or last to get the nth element?

Thanks!
Michael L. (Guest)
on 2008-12-10 00:51
(Received via mailing list)
On Tue, Dec 9, 2008 at 2:09 PM, Mark N. <removed_email_address@domain.invalid> 
wrote:
>
>
> I don't know why I'm struggling with this (not a Ruby expert) but using
> the standard scraping example:
>
>  doc = Hpricot(@response)
>  puts (doc/"#{xpath}").first.inner_html
>  puts (doc/"#{xpath}").last.inner_html
>
> What could I use instead of first or last to get the nth element?

Hpricot::Doc#search returns an instance of Hpricot::Elements, which is a
subclass of Array, so you can index it like any other array.

# we're counting from 0 here, so to get the 3rd element n=2
puts (doc/"#{xpath}")[n].inner_html

However, if the search returns too few elements, [n] is nil and calling
inner_html on it raises a NoMethodError. So it might be better to do the
search, check that the element you want actually exists, and only then
use it. Like so:

x = (doc/"#{xpath}")[n]
puts x.inner_html if x
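
A minimal sketch of that guarded lookup, assuming a plain Array of stand-in objects in place of the Hpricot::Elements result so that it runs without the hpricot gem. FakeElem and nth_inner_html are hypothetical names used only for illustration; they are not part of Hpricot.

```ruby
# Stand-in for an Hpricot element; only inner_html is modelled here (assumption).
FakeElem = Struct.new(:inner_html)

# Stand-in for the Array-like result of (doc/xpath).
elements = [FakeElem.new("one"), FakeElem.new("two"), FakeElem.new("three")]

# Hypothetical helper: inner_html of the nth match (0-based), or nil if absent.
def nth_inner_html(elements, n)
  elem = elements[n]        # Array#[] returns nil when n is out of range
  elem && elem.inner_html   # only call inner_html on a real element
end

third   = nth_inner_html(elements, 2)   # 3rd element, so n = 2
missing = nth_inner_html(elements, 10)  # no such element

puts third
puts "no element at index 10" if missing.nil?
```

With a real document the same check should apply to (doc/"#{xpath}")[n], since an out-of-range index on an Array gives nil rather than raising.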

-Michael
Mark N. (Guest)
on 2008-12-10 03:44
Michael L. wrote:

>
> # we're counting from 0 here, so to get the 3rd element n=2
> puts (doc/"#{xpath}")[n].inner_html
>

Thanks Michael, sometimes it's right in front of you. I was trying to
add a dot before [n]...

--Mark