Forum: Ruby Help with Hpricot and collect

John Zoldiark (zoldiark)
on 2008-12-18 01:48
Hi, and thanks for the help. I am new to Ruby and I am trying to get some
data from a website using the Hpricot library. I am doing this:

links = Hpricot(index_page).search("td.det_movie").collect { |link| link["href"] }

to get an array of data. The problem is that this version only reads an
attribute of each tag, and what I need now is the text inside the tags.
This is part of the HTML that I am reading, and I need the part that is
not in tags:

<table cellspacing="0" cellpadding="0" border="0" width="289">
<tbody>
<tr>
</tr>
<tr>
<td height="5"/>
</tr>
<tr>
<td class="det_movie" align="left">6:25 9:20 p.m. (Mon./Fri.) 4:00 p.m.
(No previews) (Sat./Sun./Hol.) 1:00 p.m. (No previews) 3:30 p.m.</td>
</tr>
<tr>

The file is very large, and I need every element with
class="det_movie".

Thanks for any help.
Dan Diebolt (dandiebolt)
on 2008-12-18 11:22
(Received via mailing list)
require 'hpricot'

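# Sample document: two links that share the class "link".
doc = Hpricot('<a class="link" href="detalle_movie.php?mv_id=618">The Day the Earth Stood Still</a>' \
              '<a class="link" href="detalle_movie.php?mv_id=618">The Day the Earth Stood Still</a>')

# Build an [href, text] pair for every <a class="link"> element.
links = (doc/"//a.link").collect do |link|
  [link[:href], link.inner_text]
end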

=> [
["detalle_movie.php?mv_id=618", "The Day the Earth Stood Still"],
["detalle_movie.php?mv_id=618", "The Day the Earth Stood Still"]
]