Forum: Ruby extracting the URL from hpricot element

Nikita Ratlos (nratlos)
on 2008-12-10 19:26
I want to get a list of URLs from a webpage as follows:

First I create the Hpricot element as follows:
doc = Hpricot(open(searchurl))

links = doc/"//html//body//div[6]//div[2]//a[@id='p-1']"

Next I want to append the URLs to an array as such:

results << links.each {|link| puts link.attributes['href'] }

The line nicely prints out the URLs how I need them, but then
puts the whole HTML link in the results array.

Any ideas on how to get the URLs (without the HTML) into my results array?
Dan Diebolt (dandiebolt)
on 2008-12-10 19:56
(Received via mailing list)
Use inject:

require 'hpricot'
require 'open-uri'

doc = Hpricot(open(""))

# << returns the array itself, so it becomes the accumulator for the next anchor
(doc/"//a").inject([]) do |links, anchor|
  links << anchor.attributes['href']
end
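
For anyone wondering why the original attempt stored the HTML elements: `each` returns the collection it was called on, not the values from the block, so appending its result to an array appends the elements themselves. Below is a minimal sketch of the difference that runs without Hpricot installed. The `Link` struct and the example.com URLs are made up for illustration; they only imitate an element that exposes an `attributes` hash. It also assumes the Hpricot result behaves like a plain Ruby array here.

```ruby
# Hypothetical stand-in for an Hpricot element: anything with an `attributes` hash.
Link = Struct.new(:attributes)

links = [
  Link.new({ 'href' => 'http://example.com/a' }),
  Link.new({ 'href' => 'http://example.com/b' })
]

# `each` returns its receiver, so this appends the whole collection of
# elements (not the hrefs) as a single entry.
with_each = []
with_each << links.each { |link| link.attributes['href'] }

# `map` returns a new array built from the block's return values.
with_map = links.map { |link| link.attributes['href'] }

# `inject` threads an accumulator through the block; `<<` returns the
# accumulator, so the final result is the array of hrefs.
with_inject = links.inject([]) { |acc, link| acc << link.attributes['href'] }
```

In this sketch `with_map` and `with_inject` end up holding the same list of href strings, so `map` is probably the shorter way to write it when all you need is one value per link.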