400 "Bad Request"

Hi,
I’m developing a program to fetch the HTML contents of a site using
‘net/http’. Everything works fine except for http://www.youtube.com. When
I pass that URL, I get an error like this:

/usr/lib/ruby/1.8/net/http.rb:2097:in `error!': 400 "Bad Request" (Net::HTTPServerException)

I think it is because I’m not sending a User-Agent header. Can anybody
please offer a suggestion? I’d be really grateful. This is my code
snippet.

require 'net/http'

response = Net::HTTP.get_response(URI.parse("http://www.youtube.com"))
case response
when Net::HTTPSuccess then response
when Net::HTTPRedirection then response =
  Net::HTTP.get(URI.parse(response['location']))
else
  response.error!
end

Thanks

regards
Arun K.

Hi

On Thu, Mar 26, 2009 at 12:21 PM, Arun K. wrote:

You can try getting the page with Mechanize [1].

irb -rubygems -rmechanize
irb(main):001:0> agent=WWW::Mechanize.new
irb(main):002:0> agent.get('http://www.youtube.com').code
=> "200"

Anyway, it seems you were right: without a User-Agent, YouTube returns
a 400:
irb(main):003:0> agent.user_agent=nil
=> nil
irb(main):004:0> agent.get('http://www.youtube.com').code
WWW::Mechanize::ResponseCodeError: 400 => Net::HTTPBadRequest
        from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:229:in `get'
        from (irb):4
irb(main):005:0>

[1] http://mechanize.rubyforge.org/
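
If you would rather stay with net/http, a rough sketch of the same idea is to build the request object yourself and attach the header before sending it. The helper name build_get and the agent string 'my-fetcher/0.1' are made up for illustration, and only the commented-out last line would touch the network.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of net/http): builds a GET request
# that carries an explicit User-Agent header.
def build_get(url, user_agent)
  uri = URI.parse(url)
  # request_uri is the path plus query string; it is "/" for a bare host.
  Net::HTTP::Get.new(uri.request_uri, 'User-Agent' => user_agent)
end

request = build_get('http://www.youtube.com', 'my-fetcher/0.1')
puts request['User-Agent']

# To actually send it (needs network access):
# response = Net::HTTP.start('www.youtube.com', 80) { |http| http.request(request) }
```

Building the request separately makes it easy to inspect the headers before anything goes over the wire.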

Posted via http://www.ruby-forum.com/.

From a quick look at the Net::HTTP RDoc, it looks like your best
option is the public instance method ‘get’, which accepts a hash of
request headers as its second argument.

Here is a suggested rework of your code snippet:

require 'net/http'

Net::HTTP.start('www.youtube.com', 80) { |http|
  # The second argument to get is a hash of request headers.
  response = http.get('/', { 'User-Agent' => 'ruby/net::http' })
  case response
  when Net::HTTPSuccess then response
  when Net::HTTPRedirection then response =
    Net::HTTP.get(URI.parse(response['location']))
  else
    response.error!
  end
}

Of course, this quick snippet does not handle the case where you get a
Net::HTTPRedirection and the redirected host is ‘www.youtube.com’
again: the follow-up Net::HTTP.get call does not pass the “User-Agent”
header. (Then again, your original code didn’t error check the
redirection response either.) So this isn’t a complete solution, but
it at least shows how to send the “User-Agent” header; I used
a variation of this code and was able to get content from
www.youtube.com.

Coey
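
One possible way to cover the redirect case that the snippet above leaves open is to send the header on every hop and cap the number of redirects. This is only a sketch, not a tested fix for YouTube: the names fetch_once and fetch_following_redirects, the limit of 5, and the injectable fetcher argument are all assumptions made for illustration, and the :use_ssl option assumes a newer Ruby than the 1.8 shown in the thread.

```ruby
require 'net/http'
require 'uri'

USER_AGENT = 'ruby/net::http' # same string as in the snippet above

# Hypothetical helper: one GET with the User-Agent header set.
def fetch_once(uri)
  Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
    http.get(uri.request_uri, 'User-Agent' => USER_AGENT)
  end
end

# Hypothetical helper: follows redirects for at most `limit` requests,
# sending the header each time. `fetcher` is injectable so the loop can
# be exercised without network access.
def fetch_following_redirects(url, limit = 5, fetcher = method(:fetch_once))
  uri = URI.parse(url)
  limit.times do
    response = fetcher.call(uri)
    case response
    when Net::HTTPSuccess
      return response
    when Net::HTTPRedirection
      # merge also resolves relative Location values against the current URI
      uri = uri.merge(response['location'])
    else
      response.error!
    end
  end
  raise "too many redirects for #{url}"
end

# Usage (needs network access):
# page = fetch_following_redirects('http://www.youtube.com')
# puts page.code
```

Capping the hop count guards against redirect loops, and keeping the single request in its own method means the header is set in exactly one place.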