Hi, I want to write a little spider to do some web searching,
but I have no idea how to start. Is there an example or something like
that?
Thanks,
yuesefa
Haofei wrote:
Hi, i want to write a little spider to do some web search
but have no idea how to start it. is there any example or something
like that?
thanks
yuesefa
You can see an extremely simple/limited one that I made a while back:
http://students.seattleu.edu/collinsj/programs_netcrawler.html
It may give you a place to start, but there are very good libraries for
getting and parsing websites, like Rubyful Soup, Mechanize, open-uri,
and so on.
Also, try searching the archives for more.
-Justin
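For anyone who wants to see the "getting and parsing" part without installing anything, here is a minimal sketch using only Ruby's standard library. The helper name extract_links and the example URLs are made up for illustration, and the regex is deliberately crude; a real parser such as the libraries mentioned above will cope with messy HTML far better.

```ruby
require 'uri'

# Hypothetical helper: pull href values out of an HTML string and resolve
# them against the URL of the page they came from.
def extract_links(html, base_url)
  base = URI.parse(base_url)
  hrefs = html.scan(/<a\s[^>]*?href\s*=\s*["']([^"'#]+)/i).flatten
  links = hrefs.filter_map do |href|
    begin
      uri = base.merge(href.strip)        # turns "/about" into an absolute URL
      uri.to_s if uri.is_a?(URI::HTTP)    # keeps http/https, drops mailto: etc.
    rescue URI::Error
      nil                                 # skip hrefs that are not valid URIs
    end
  end
  links.uniq
end

html = '<a href="/about">About</a> <a href="mailto:me@example.com">Mail</a>'
puts extract_links(html, "http://example.com/index.html").inspect
```

To feed it a live page, the HTML string could come from open-uri, for example URI.open(url).read on recent Ruby versions.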
On Thu, Aug 24, 2006 at 02:09:28AM +0900, Haofei wrote:
Hi, i want to write a little spider to do some web search
but have no idea how to start it. is there any example or something like
that?
thanks
yuesefa
You can write one with WWW::Mechanize. I have an example on my blog:
http://tenderlovemaking.com/2006/05/26/mechanize-one-liners/
There is also an example spider that comes along with Mechanize, just
look in the ‘eg’ directory.
Here is the spider for those who don’t want to click (it’s not perfect,
but it’s small!):
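require 'mechanize'
# (Mechanize 1.0 and later dropped the WWW:: prefix: use Mechanize.new there.)

# Fetch the start URL given on the command line.
(mech = WWW::Mechanize.new).get(ARGV[0])
# Recursively click every link on the current page that has not been visited.
(a = lambda { |p|
  mech.page.links.each { |l|
    mech.click(l) && p.call(p) if ! mech.visited? l
  }
}).call(a)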
–Aaron
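The same idea (follow every link you have not seen yet) can be spelled out a bit more explicitly. This is only a sketch under some assumptions: crawl, fetch_links, and max_pages are names invented here, and the "site" is an in-memory hash standing in for the network, so it runs without Mechanize. In practice fetch_links would wrap whatever library fetches a page and returns its links.

```ruby
require 'set'

# Hypothetical sketch: depth-first crawl with a visited set.
# fetch_links is any callable that maps a URL to the URLs it links to.
def crawl(start_url, fetch_links, max_pages: 50, visited: Set.new)
  # Stop on pages already seen, or once the page budget is used up.
  return visited if visited.include?(start_url) || visited.size >= max_pages
  visited << start_url
  fetch_links.call(start_url).each do |link|
    crawl(link, fetch_links, max_pages: max_pages, visited: visited)
  end
  visited
end

# Stand-in for the network: a tiny "site" whose pages link in a cycle.
site = {
  "a" => ["b", "c"],
  "b" => ["a", "d"],
  "c" => [],
  "d" => ["a"]
}
pages = crawl("a", ->(url) { site.fetch(url, []) })
puts pages.to_a.inspect
```

The visited set is what keeps the cycle (a links to b, b links back to a) from looping forever, and max_pages is a simple guard against wandering across the whole web. Recursion keeps it short; on a very deep site an explicit queue would be the safer choice.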
I found Hpricot easy to use:
http://code.whytheluckystiff.net/hpricot/
There are some good code examples on the site too.
Chris