How to get all image, PDF and other file links from a website?

I have to develop an application that fetches links to all files with
image, PDF, CGI, etc. extensions from a website.

Can anybody guide me on where I should begin?

You can find useful information at
http://railscasts.com/episodes?utf8=✓&search=nokogiri

Especially Mechanize.
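As a rough sketch of the idea using only the Ruby standard library (a naive regex stands in for a real HTML parser; Nokogiri or Mechanize would do this far more robustly): the method name `asset_links` and the extension list are made up for this example.

```ruby
require 'uri'

# Extensions we want to collect; adjust to taste.
WANTED = %w[.jpg .jpeg .png .gif .pdf .cgi].freeze

# Pull href/src values out of an HTML string, resolve them against the
# page's base URL, and keep only links ending in one of the WANTED
# extensions. A regex over raw HTML is fragile; with Nokogiri you would
# use doc.css('a[href], img[src]') instead.
def asset_links(html, base_url)
  html.scan(/(?:href|src)\s*=\s*["']([^"']+)["']/i).flatten.filter_map do |link|
    abs = URI.join(base_url, link).to_s rescue next
    abs if WANTED.any? { |ext| URI(abs).path.downcase.end_with?(ext) }
  end
end
```

From here a crawler would fetch each page, call something like `asset_links` on it, and queue any same-site HTML links it finds for the next round.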

Regards,


Felipe Fontoura
Computer Engineer
http://www.felipefontoura.com

2012/1/4 cyber y. [email protected]

Well, wget has a mirror mode that will clone a website:

wget --mirror http://www.example.com

Or you could look at Nutch (http://wiki.apache.org/nutch/), which is a
web crawler for building search indexes.
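If you only want files with certain extensions rather than a full mirror, wget's recursive mode with an accept list may be enough (example.com is a placeholder for the real site):

```shell
# -r  : recursive download
# -np : never ascend above the starting directory
# -A  : comma-separated accept list of file suffixes to keep
wget -r -np -A jpg,jpeg,png,gif,pdf http://www.example.com/
```

wget still has to fetch the HTML pages to follow links, but it deletes them afterwards and keeps only files matching the accept list.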
