I’m trying to write a tool that will take a domain as an argument and
make a request to http://onsamehost.com and then capture the list of
domains that share that same IP. I want to parse out those domains and put
them into an array that I can print to a file later.
resp, data = @http.get2(PATH, {'User-Agent' => USERAGENT})
puts resp
puts data
The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn’t happen when I
make the request from a regular browser.
So I sniffed the regular request with wireshark, and a browser sends a
bunch of additional headers when it makes the request. Cookies,
referrer, etc.
Are any of these headers more necessary than others, and is there a
preferred way to send the headers using Ruby?
The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn’t happen when I
make the request from a regular browser.
Actually, it does – you just don’t see it.
When you request e.g. http://example.com most servers will send
a redirect to the default page, e.g. http://example.com/index.html.
You need to either handle it or pass the default page’s full URL.
The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn’t happen when I
make the request from a regular browser.
That site makes heavy use of redirects. Watch closely while running
queries or check your browser history.
So I sniffed the regular request with wireshark, and a browser sends a
bunch of additional headers when it makes the request. Cookies,
referrer, etc.
Are any of these headers more necessary than others, and is there a
preferred way to send the headers using Ruby?
Headers probably have no effect here.
What you probably want is code like this:
require 'net/http'
require 'uri'

def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0

  response = Net::HTTP.get_response(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess     then response
  when Net::HTTPRedirection then fetch(response['location'], limit - 1)
  else
    response.error!
  end
end
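That said, if you do want to send extra headers (User-Agent, Referer, etc.), Net::HTTP lets you treat the request object like a hash. A minimal sketch, with placeholder values, and no network round trip:

```ruby
require 'net/http'

# Build a GET request and set headers on it; path and values are placeholders.
req = Net::HTTP::Get.new('/path')
req['User-Agent'] = 'MyTool/1.0'
req['Referer']    = 'http://example.com/'

# Headers read back like a hash, too.
puts req['User-Agent']
```

To actually send it you would pass the request to a connection, e.g. `Net::HTTP.start('example.com') { |http| http.request(req) }`.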
Thanks much, Michael. Unfortunately I’m not quite tracking on why that
was necessary. It just seems a bit elaborate given what I thought was a
simple problem.
But I totally appreciate it…I just wish it were something simpler.
The site you’re hitting makes heavy use of redirects (and not really
for their intended purpose). What this means is that you submit your
request for a given URL and the server responds with a redirect and a
new URL. If you are working in a browser, your browser automatically
requests that URL, and the server again responds with a redirect and a
new URL. Again, a web browser handles requesting that next URL
automatically. This URL is the actual results page with the data you
want. It’s the web site making you jump through hoops to get where you
want to go.
Net::HTTP does not have a built-in facility for following redirects
the way your browser does. So you have to write code to follow
redirects by submitting new requests until you get to one that is not
a redirect, which is what the fetch() method from the Net::HTTP
example does.
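The case/when dispatch in that example works because Net::HTTP models HTTP status codes as a class hierarchy: a 301 response is an instance of Net::HTTPMovedPermanently, which is a subclass of Net::HTTPRedirection. A quick stdlib-only check (no network needed):

```ruby
require 'net/http'

# A 301 is a kind of redirection...
puts Net::HTTPMovedPermanently.ancestors.include?(Net::HTTPRedirection)
# ...but not a kind of success, so case/when can tell them apart.
puts Net::HTTPMovedPermanently.ancestors.include?(Net::HTTPSuccess)
```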
The problem is that I keep getting a redirect
(#<Net::HTTPMovedPermanently:0xb7c35ffc>), which doesn’t happen when I
make the request from a regular browser.
So I sniffed the regular request with wireshark, and a browser sends a
bunch of additional headers when it makes the request. Cookies,
referrer, etc.
Are any of these headers more necessary than others, and is there a
preferred way to send the headers using Ruby?
We have had similar issues where we didn’t see a redirect when sniffing
the browser’s traffic, but it did happen for our code. The reason was
HTTP/1.1: it requires the client to specify the host it expects to be
talking to (as more than one virtual host may be served by a single
server):
GET / HTTP/1.1
Host: www.apache.org
(see http://www.apacheweek.com/features/http11 for reference)
Hope that helps in avoiding the redirect.
Uwe
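For what it’s worth, Net::HTTP speaks HTTP/1.1 and fills in the Host header from the address you connect to, but you can also set it explicitly on the request object, e.g. when probing a name-based virtual host. A sketch (www.apache.org is just the example host from above):

```ruby
require 'net/http'

# Build a request and set the Host header explicitly, the way a browser
# does for a name-based virtual host.
req = Net::HTTP::Get.new('/')
req['Host'] = 'www.apache.org'
puts req['Host']
```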