Get Source Code of a http://... site?


#1

Hi all

I want to download the source code of a website using Ruby. How can I
achieve this? When searching with Google, I only get download links
for the Ruby source etc. :wink:

Thanks
Josh


#2

2009/4/20 Joshua M. removed_email_address@domain.invalid:

I want to download the source code of a website using Ruby. How can I
achieve this? When searching with Google, I only get download links
for the Ruby source etc. :wink:

The source code of a web site is usually not available apart from the
JavaScript you see in pages - for good reasons (security,
copyright…).

Cheers

robert


#3

Robert K. wrote:

2009/4/20 Joshua M. removed_email_address@domain.invalid:

I want to download the source code of a website using Ruby. How can I
achieve this? When searching with Google, I only get download links
for the Ruby source etc. :wink:

The source code of a web site is usually not available apart from the
JavaScript you see in pages - for good reasons (security,
copyright…).

Cheers

robert

Oh, I wasn’t clear enough, I just need the XHTML code, no
behind-the-scenes Ruby or PHP or stuff. :wink:

It’s because I want to use captchator.com, and I need to check the result
(0 or 1) using a URL like this:

http://captchator.com/captcha/check_answer/#{captcha_code}/#{@comment.captcha}
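For a check like this, Ruby's standard library is probably enough. Below is a minimal sketch, assuming the service answers with a plain-text body of "0" or "1" as described above. The helper names (captcha_check_uri, captcha_passed?, check_captcha) are made up for illustration, and the escaping of the two path segments is an added precaution, not something the service is known to require.

```ruby
require "net/http"
require "uri"
require "erb"

# Base of the check URL quoted in the post above.
CAPTCHATOR_BASE = "http://captchator.com/captcha/check_answer"

# Build the check URL. Both values are percent-encoded so that
# user-supplied input cannot change the shape of the path.
def captcha_check_uri(captcha_code, answer)
  code = ERB::Util.url_encode(captcha_code.to_s)
  ans  = ERB::Util.url_encode(answer.to_s)
  URI("#{CAPTCHATOR_BASE}/#{code}/#{ans}")
end

# Interpret the response body: "1" is assumed to mean a correct answer.
def captcha_passed?(body)
  body.to_s.strip == "1"
end

# Perform the request and return true or false (hypothetical helper).
def check_captcha(captcha_code, answer)
  body = Net::HTTP.get(captcha_check_uri(captcha_code, answer))
  captcha_passed?(body)
end
```

In a controller this could be called as check_captcha(captcha_code, @comment.captcha). Network errors are not handled here and would need a rescue in real use.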


#4

2009/4/20 Joshua M. removed_email_address@domain.invalid:

Oh, I wasn’t clear enough, I just need the XHTML code, no
behind-the-scenes Ruby or PHP or stuff. :wink:

There are mechanize, hpricot and a few other alternatives that can
help you there.

Cheers

robert


#5

Robert K. wrote:

2009/4/20 Joshua M. removed_email_address@domain.invalid:

Oh, I wasn’t clear enough, I just need the XHTML code, no
behind-the-scenes Ruby or PHP or stuff. :wink:

There are mechanize, hpricot and a few other alternatives that can
help you there.

Cheers

robert

Thanks, but that would be overkill, I guess. I found the following
site, which explains how to do it with just a few built-in methods:

http://blog.thembid.com/2007/08/06/using-ruby-to-scrape-a-web-page/
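The linked post is not reproduced here, so the following is only a rough sketch of fetching a page's markup with nothing but the standard library (Net::HTTP). The method name fetch_markup and the redirect limit of 5 are assumptions made for illustration and may differ from what the post does.

```ruby
require "net/http"
require "uri"

# Return the body (the HTML/XHTML markup) of the page at +url+,
# following up to +redirect_limit+ redirects.
def fetch_markup(url, redirect_limit = 5)
  raise ArgumentError, "too many redirects" if redirect_limit.zero?

  response = Net::HTTP.get_response(URI(url))

  case response
  when Net::HTTPSuccess
    response.body
  when Net::HTTPRedirection
    # The Location header may be relative, so resolve it against +url+.
    next_url = URI.join(url, response["location"]).to_s
    fetch_markup(next_url, redirect_limit - 1)
  else
    raise "Request failed: #{response.code} #{response.message}"
  end
end

# Usage (performs a real request, so it is left commented out):
# puts fetch_markup("http://example.com/")
```

This only returns what the server sends over HTTP; anything generated later by JavaScript in the browser will not be in the result.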


#6

If you just need to get it, for an archive or something, and only happened
to choose Ruby, have a look at HTTrack. It's not Ruby, just a standalone
app, but it mirrors pages.

- Daniel

#7

Daniel Huckstep wrote:

If you just need to get it, for an archive or something, and only happened
to choose Ruby, have a look at HTTrack. It's not Ruby, just a standalone
app, but it mirrors pages.

- Daniel

Thanks, I know HTTrack… I've had some problems with it lately, though…
:wink: