Connection reset by peer (Errno::ECONNRESET)


#1

Hi,
I’m a newbie to Ruby and I’m trying to parse the HTML content of a
website using net/http. All URLs work fine except for
https://www.google.com/accounts/AuthSubRequest?scope=http%3A%2%2Fwww.google.com%2Fcalendar%2Ffeeds%2F&session=0&secure=0&next=http%3A%2F%2Fwww.google.com
When I try to access this URL I get an error like this:

/usr/lib/ruby/1.8/net/protocol.rb:135:in `sysread': Connection reset by peer (Errno::ECONNRESET)
from /usr/lib/ruby/1.8/net/protocol.rb:135:in `rbuf_fill'
from /usr/lib/ruby/1.8/timeout.rb:62:in `timeout'
from /usr/lib/ruby/1.8/timeout.rb:93:in `timeout'
from /usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill'
from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
from /usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
from /usr/lib/ruby/1.8/net/http.rb:2009:in `read_new'
from /usr/lib/ruby/1.8/net/http.rb:1050:in `request'
from ./rssModule.rb:40:in `extractor'
from /usr/lib/ruby/1.8/net/http.rb:543:in `start'
from ./rssModule.rb:38:in `extractor'
from ./rssModule.rb:19:in `each'
from ./rssModule.rb:19:in `extractor'
from rss.rb:38

I don’t know what exactly causes the problem, i.e. whether it depends on
the OS, the browser, etc. My code snippet is shown below:
uri = URI.parse("https://www.google.com/accounts/AuthSubRequest?scope=http%3A%2%2Fwww.google.com%2Fcalendar%2Ffeeds%2F&session=0&secure=0&next=http%3A%2F%2Fwww.google.com")
header = { "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" }
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl()
http.start do |https|
  request = Net::HTTP::Get.new(uri.path, header)
  response = https.request(request, header)
  response.value
end

Please help me. I’m in big trouble if I can’t solve it.

Regards
Arun


#2

First, the URL is badly encoded. In the scope parameter,
http%3A%2%2Fwww…
should be
http%3A%2F%2Fwww…

Second, to access a protected resource, it’s not enough just to start
the URL with https :)

You need to know the login and password. Once you know them, you
should set them before performing the request:

request.basic_auth 'account', 'password'

Dmitry
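
The two points above can be sketched together as follows. This is only an illustration, not a confirmed fix: 'account' and 'password' are the placeholders from the post, and the sketch additionally assumes the assignment form `use_ssl = true` and `uri.request_uri`, neither of which appears in the original snippet. The actual request is left commented out so that nothing is sent.

```ruby
require "net/http"
require "uri"
# Assumption: on Ruby 1.8 (the version in the trace) SSL support is
# loaded separately with: require "net/https"

# Corrected URL: "%2F%2F" instead of "%2%2F" in the scope parameter.
url = "https://www.google.com/accounts/AuthSubRequest" \
      "?scope=http%3A%2F%2Fwww.google.com%2Fcalendar%2Ffeeds%2F" \
      "&session=0&secure=0&next=http%3A%2F%2Fwww.google.com"
uri = URI.parse(url)

http = Net::HTTP.new(uri.host, uri.port)
# Assumption: turn TLS on with the setter whenever the scheme is https.
http.use_ssl = (uri.scheme == "https")

headers = { "User-Agent" => "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" }
# request_uri is the path plus the query string; uri.path alone has no query.
request = Net::HTTP::Get.new(uri.request_uri, headers)
request.basic_auth("account", "password") # placeholders from the post

puts request.path
# To actually send it:
# response = http.start { |conn| conn.request(request) }
```

Whether the server accepts basic auth for this URL is a separate question; the sketch only shows how the request would be put together.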


#3

Hi,
Thanks for the reply. I applied basic authentication and corrected the
URL, but it still shows the same error. Besides, can you please tell
me whether this error occurs only for sites that require basic
authentication, or whether it can also occur for sites that do not
require any authentication at all? I’ll be pleased to
hear from you.

Thanking you

Regards
Arun


#4

The error is raised whenever the server closes the connection on you
instead of answering, e.g. because of an auth failure or a service outage.

As for basic auth, it’s only one of many ways to
authenticate on the web, and I’m not sure that Google Account protected
services support it. Probably the only way to fulfil such a request is
to emulate a browser request and store all the cookies and headers that
Google sets. But that won’t be easy, if it’s possible at all. So you’d
better find an API to access the service, or else not scrape such
URLs. Also, I’m quite sure that scraping them violates Google’s terms.

Dmitry
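
If the reset comes from a temporary outage rather than from a refused login, one option is to rescue the exception and try again a limited number of times. The sketch below is an assumption-laden illustration: `with_reset_retry` is a made-up helper name, and retrying will not help when the server rejects the request on purpose.

```ruby
# Hypothetical helper: re-runs the block when the peer resets the
# connection, and gives up after `attempts` tries.
def with_reset_retry(attempts: 3, wait: 0.5)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Errno::ECONNRESET
    raise if tries >= attempts # out of tries: let the caller see the error
    sleep(wait)
    retry
  end
end

# Usage with a stand-in block that fails twice and then succeeds.
result = with_reset_retry(attempts: 3, wait: 0) do |try|
  raise Errno::ECONNRESET if try < 3
  "succeeded on try #{try}"
end
puts result
```

In real use the block would wrap the `http.start { ... }` call.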


#5

Hi Dmitry,
Thanks very much for the suggestion. I still have a few doubts:

  1. Does this error have any connection with a connection timeout?
  2. Is there any way to resolve the error other than the ones you
    mentioned, i.e. anything I can do on my side to resolve the problem?

Thanks again for the quick reply

Regards
Arun K.


#6

  1. No, I don’t think the timeout is the problem here.
  2. As I said, I think there is no easy way, or no way at all, to
    do it without using the Google APIs.

Dmitry
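
On the timeout question: the trace shows the read running inside a timeout block, yet the exception raised is Errno::ECONNRESET, not a timeout error. In Ruby these are separate exception classes, so they can be told apart in a rescue. A small sketch, where `failure_kind` is a hypothetical helper name:

```ruby
require "net/http"
require "timeout"

# Hypothetical helper: names the kind of failure an exception represents.
def failure_kind(error)
  case error
  when Timeout::Error    then :timeout # the wait ran out before the peer answered
  when Errno::ECONNRESET then :reset   # the peer dropped the connection
  else :other
  end
end

puts failure_kind(Errno::ECONNRESET.new)
puts failure_kind(Timeout::Error.new)
```

A reset means the other side actively closed the connection, so raising the read timeout would not be expected to change the outcome.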