Getting Response from HTTPS POST


I am writing a crawler to parse webpages. One site that I am crawling
requires me to log in, so I use an HTTPS POST to log in. However, once
I send the POST I can’t get anywhere because I have to have a valid
session id in the URL. If I log in using Firefox, the session id is
appended to the URL for every page that I visit (something like ...).
How can I get this session id
so that I can append it to my URLs and crawl the page? They used to
send the session id in a cookie but they no longer use cookies (you
will see the attempt to get the cookie still in this code). Here is
what I have:

require 'net/https'
require 'uri'
require 'cgi'

  url = '<appropriate URL here>'
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = uri.scheme == 'https'
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE

  response = self.get_data(http, uri, headers)
  page = response.body

  # grab the hidden __VIEWSTATE field from the page
  view_state = CGI::escape(page[/<input type="hidden" name="__VIEWSTATE" value="([^"]*)"/, 1])
  post_data = ''

  login_response = http.post('<appropriate path here>', post_data, headers)

  cookie = nil
  location = nil
  login_response.each_header do |name, value|
    cookie = value.split(';').first if name == 'set-cookie'
    location = value if name == 'location'
  end

  headers['Cookie'] = cookie if cookie

  if location
    homepage = get_data(http, URI.parse('<appropriate URI here>' + location), headers).body
  else
    homepage = get_data(http, URI.parse('<appropriate URI here>'), headers).body
  end

  start_with_homepage(homepage, http, headers)
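
One thing that may be worth checking is whether the session id already shows up in the `location` value captured above. The `__VIEWSTATE` field suggests an ASP.NET site, and ASP.NET's cookieless sessions put the id in a `(S(...))` path segment. What follows is only a rough sketch under that assumption: the helper names `session_segment` and `with_session`, the pattern, and the sample values are all hypothetical and not taken from the real site.

```ruby
# ASSUMPTION: the session token is an ASP.NET-style "(S(<id>))" path segment.
# The real site may use a different format; adjust the pattern if so.
SESSION_SEGMENT = %r{/(\(S\([^)]+\)\))(?=/|\z)}

# Hypothetical helper: returns the "(S(...))" segment found in a redirect
# location, or nil when the location carries no such segment.
def session_segment(location)
  location[SESSION_SEGMENT, 1]
end

# Hypothetical helper: puts the session segment in front of a site-relative
# path so that later requests carry the same session.
def with_session(segment, path)
  "/#{segment}/#{path.sub(%r{\A/}, '')}"
end

# Made-up values, only to show the shape of the data.
location = '/(S(abc123))/home.aspx'
segment = session_segment(location)
puts segment                                      # (S(abc123))
puts with_session(segment, '/reports/list.aspx')  # /(S(abc123))/reports/list.aspx
```

If `session_segment` returns nil for the real `location` value, the id is probably delivered some other way (for example inside the body of the login response), and printing every response header is a reasonable next step.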