Parsing problems using https and redirects

Hello list,
I have to develop a simple script to parse some parts of a web site,
and I thought it could be a good opportunity to start trying Ruby.
I found that there are two network libraries that I could supposedly
use to retrieve the contents of the web site: open-uri and net-http.

First problem
This web site is accessible only over https and has a self-signed
certificate. So far this has made it impossible for me to access the
contents of the web site.
Simple examples from the Hpricot HTML parsing library, like this one:

require 'hpricot'
require 'open-uri'
doc = Hpricot(open("https://xxxxxx"))

will not work, because the call to open fails on the https
connection.
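
One common workaround, sketched here under assumptions rather than offered as the definitive fix, is to use Net::HTTP from the standard library with certificate verification switched off. The helper names insecure_https_client and fetch_body are made up for this illustration, and the URL in the usage note is a placeholder. Disabling verification removes the protection against impersonation, so it is only reasonable for a site you already trust.

```ruby
require 'net/http'   # on Ruby 1.8, require 'net/https' to get use_ssl=
require 'openssl'
require 'uri'

# Hypothetical helper: builds a Net::HTTP client for the given URL that
# accepts any certificate, including a self-signed one.
# No connection is opened until a request is made.
def insecure_https_client(url)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  # Assumption: skipping verification is acceptable for this one site.
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  http
end

# Hypothetical helper: returns the response body as a String.
def fetch_body(url)
  uri = URI.parse(url)
  insecure_https_client(url).start do |http|
    http.get(uri.request_uri).body
  end
end
```

The string returned by fetch_body can then be handed to the parser in place of the open call, for example doc = Hpricot(fetch_body("https://xxxxxx")).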

Second problem
I also need to know how to handle redirection and cookies. But to be
fair, I can still do some further reading myself on these issues.
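
For orientation, redirects and cookies can be handled by hand with Net::HTTP along the following lines. This is a minimal sketch under assumptions: http_get and fetch_following_redirects are hypothetical names, the cookie handling keeps only name=value pairs and ignores attributes such as Path and Expires, and certificate verification is disabled only because of the self-signed certificate mentioned above.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: performs a single GET, sending the given Cookie header.
def http_get(uri, cookie_header)
  http = Net::HTTP.new(uri.host, uri.port)
  if uri.scheme == 'https'
    http.use_ssl = true
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE # assumption: trusted site
  end
  request = Net::HTTP::Get.new(uri.request_uri)
  request['Cookie'] = cookie_header unless cookie_header.empty?
  http.request(request)
end

# Hypothetical helper: follows at most `limit` redirects and replays every
# cookie received so far on each hop. An optional block replaces the
# network call, which makes the logic testable offline.
def fetch_following_redirects(url, limit = 5, &requester)
  requester ||= method(:http_get)
  uri = URI.parse(url)
  cookies = {}
  loop do
    header = cookies.map { |name, value| "#{name}=#{value}" }.join('; ')
    response = requester.call(uri, header)
    # Remember name=value from each Set-Cookie line; attributes are dropped.
    (response.get_fields('set-cookie') || []).each do |line|
      name, value = line.split(';').first.split('=', 2)
      cookies[name.strip] = value.to_s.strip
    end
    redirect = response.is_a?(Net::HTTPRedirection) && response['location']
    return response unless redirect
    limit -= 1
    raise 'too many redirects' if limit < 0
    # URI#merge resolves relative Location values against the current URI.
    uri = uri.merge(response['location'])
  end
end
```

A library such as Mechanize does all of this internally, so the sketch is mainly useful for seeing what is involved.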

Thank you very much.

Quoth Ramiro Diaz Trepat:

> Simple examples from the Hpricot html parsing library like this one:
> [...]
> fair, I still can do some further reading myself on these issues.
>
> Thank you very much.

  1. Look at mechanize.

  2. Look at http-access2 (or whatever it’s been renamed to).

Regards,

Thank you very much, Konrad; it seems that I am on my way now.
The only weird thing with Mechanize is that it all works perfectly
on my Linux machine but not on my Mac (Leopard).
Both have Ruby 1.8.6.

On the mac I get the following error while trying to execute the first
Mechanize example:

./mechanize.rb:4: uninitialized constant WWW (NameError)
from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:27:in `gem_original_require'
from /Library/Ruby/Site/1.8/rubygems/custom_require.rb:27:in `require'
from goog.rb:2

and the code is the first Mechanize example:

require 'rubygems'
require 'mechanize'

agent = WWW::Mechanize.new
agent.user_agent_alias = 'Mac Safari'
page = agent.get("http://www.google.com/")
search_form = page.forms.with.name("f").first
search_form.q = "Hello"
search_results = agent.submit(search_form)
puts search_results.body

I really don't know why this constant is uninitialized or how I could
initialize it. Besides, it worries me that on Linux, after installing
the mechanize gem, everything worked out of the box.

Thanks again
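
One thing the trace itself suggests: its first frame is ./mechanize.rb, a file in the current directory, not a file inside the installed gem. If a script with that name exists next to goog.rb, require 'mechanize' may be loading it instead of the library, since Ruby 1.8 includes the current directory in its load path. The following is a small check, offered as a sketch; require_candidates is a hypothetical helper and not part of RubyGems or Mechanize.

```ruby
# Hypothetical helper: lists every "<name>.rb" found in the given
# directories, in search order. A hit in "." ahead of the library means
# a local script is shadowing it.
def require_candidates(name, search_path = $LOAD_PATH)
  search_path.map { |dir| File.join(dir.to_s, "#{name}.rb") }
             .select { |path| File.file?(path) }
end

# The current directory is checked explicitly, because newer Ruby
# versions no longer include it in $LOAD_PATH.
puts require_candidates('mechanize', (['.'] + $LOAD_PATH).uniq)
```

If ./mechanize.rb shows up in the output, renaming that script would be the first thing to try. Note that a gem which has not been activated yet will not appear in the list, so an empty result does not mean the gem is missing.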
