Potential bug in Net::HTTP, and tentative patch

Yves-Eric Martin (Guest)
on 2009-04-13 07:41
(Received via mailing list)
Hi all,


I am building a nice little system using Ruby (on Rails), one part of
which uses Net::HTTP to retrieve some data over HTTP. Everything seems
to work fine, but on some requests, I get an EOFError.

As I found out, this problem has already been reported, but without any
answer. See:
http://rubyforge.org/forum/forum.php?thread_id=288...


I think I may have traced the problem back to a bug in Net::HTTP. Here
is a two-liner to reproduce the error:

$ irb
 >> require 'net/http'
=> true
 >> res = Net::HTTP.get_response(URI.parse('http://snapcasa.com/get.aspx?code=1000&size=m&url=...))
EOFError: end of file reached
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/protocol.rb:133:in `sysread'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/protocol.rb:132:in `rbuf_fill'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:2236:in `read_chunked'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:2216:in `read_body_0'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:2182:in `read_body'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:2207:in `body'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:2146:in `reading_body'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:1061:in `request'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:957:in `request_get'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:380:in `get_response'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:547:in `start'
    from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:379:in `get_response'
    from (irb):2
 >>


My quick analysis:

In http.rb, line 2236 the "read_chunked" function calls
@socket.readline. This "readline" function, in protocol.rb line 126,
calls readuntil("\n"). This works fine if the data chunk is
"\n-terminated", but throws an EOFError if it is not.

I am not sure about the underlying standards for chunked HTTP. Maybe the
data chunk is supposed to always be \n-terminated and this is a
misbehaving server, but the fact is: I can get the example image fine
with any browser, but not with Net::HTTP.

As a tentative fix, I wrote a patch that catches the EOFError in
read_chunked. You will find the patch file attached. With the patch,
things work fine:

$ irb
 >> require 'net/http'
=> true
 >> res = Net::HTTP.get_response(URI.parse('http://snapcasa.com/get.aspx?code=1000&size=m&url=...))
=> #<Net::HTTPOK 200 OK readbody=true>
 >> File.open('test.jpg','w').write res.body
=> 2881
 >>

With the patch, the above gives me a perfectly fine JPEG file. However,
I am afraid my current patch, with a big begin...rescue around most of
the body of the read_chunked function, catches the EOFError at a higher
level than necessary, which is not good practice...


Anyway, before continuing any further, could someone involved in the
development of Ruby take a look at this, confirm the existence of the
bug, and maybe even come up with a better fix?

PS: please let me know if I posted this in the wrong list, or if I
should open a bug report on some bug tracking system.


Thank you for your help.
Nobuyoshi N. (Guest)
on 2009-04-13 15:20
(Received via mailing list)
Hi,

At Mon, 13 Apr 2009 12:40:32 +0900,
Yves-Eric Martin wrote in [ruby-talk:333704]:
> In http.rb, line 2236 the "read_chunked" function calls
> @socket.readline. This "readline" function, in protocol.rb line 126,
> calls readuntil("\n"). This works fine if the data chunk is
> "\n-terminated", but throws an EOFError if it is not.

Not "\n-terminated".

According to RFC 2616 and RFC 2068, each chunk consists of a
chunk-size and chunk-data, and the chunked body is terminated
by a chunk of size "0".

That is, the response doesn't seem to follow the RFCs.
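
To make the framing concrete, here is a minimal sketch of a chunked-body reader working on an in-memory stream. The helper name each_chunk is hypothetical and this is not Net::HTTP's actual implementation; it only illustrates why a body that ends without the "0" chunk leads to an EOFError when the reader asks for the next chunk-size line.

```ruby
require 'stringio'

# Hypothetical helper for illustration only (not part of Net::HTTP).
# Yields the data of each chunk read from io until the "0" chunk.
def each_chunk(io)
  loop do
    size_line = io.readline("\r\n")   # raises EOFError at end of input
    hex = size_line[/\A\h+/] or
      raise ArgumentError, "bad chunk-size line: #{size_line.inspect}"
    size = hex.hex
    break if size.zero?               # the "0" chunk terminates the body
    yield io.read(size)               # chunk-data
    io.read(2)                        # CRLF that follows the chunk-data
  end
end

# A compliant body: two chunks, then the terminating "0" chunk.
compliant = StringIO.new("5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
received = []
each_chunk(compliant) { |data| received << data }
puts received.join

# The same body without the "0" chunk: both chunks arrive, then EOFError.
truncated = StringIO.new("5\r\nhello\r\n6\r\n world\r\n")
partial = []
begin
  each_chunk(truncated) { |data| partial << data }
rescue EOFError => e
  puts "#{e.class} after #{partial.join.bytesize} bytes"
end
```

In the truncated case all the chunk data has already been handed to the block before the error is raised, which is why a browser can still display the image.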
Yves-Eric Martin (Guest)
on 2009-04-14 07:33
(Received via mailing list)
Thank you for pointing me to the RFC. Indeed, the response does not
seem RFC-compliant...

Other than my quick and dirty patch, is there a way to tell Net::HTTP
to ignore the EOFError and accept non-compliant input? Again, the point
is that an image, which displays fine in Internet Explorer, Firefox and
Safari, cannot be downloaded with Net::HTTP. While I understand the
"not RFC-compliant" argument, for practical reasons, it does seem a bit
limiting...

Thank you,


PS: I will also contact the administrator of the problem site regarding
this RFC compliance issue.

--
Yves-Eric
Nobuyoshi N. (Guest)
on 2009-04-14 08:26
(Received via mailing list)
Hi,

At Tue, 14 Apr 2009 12:33:01 +0900,
Yves-Eric Martin wrote in [ruby-talk:333820]:
> Other than my quick and dirty patch, is there a way to tell Net::HTTP
> to ignore the EOFError and accept non-compliant input? Again, the point
> is that an image, which displays fine in Internet Explorer, Firefox and
> Safari, cannot be downloaded with Net::HTTP. While I understand the
> "not RFC-compliant" argument, for practical reasons, it does seem a bit
> limiting...

See rdoc of Net::HTTPResponse#read_body and
Net::HTTP#request_get.

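  require 'net/http'

  # uri is the URI object for the resource, e.g. the result of URI.parse
  out = "" # or open(destfile, "wb")
  begin
    Net::HTTP.get_response(uri) do |res|
      # block form: each body segment is appended as soon as it is read
      res.read_body {|s| out << s}
    end
  rescue EOFError
    # raised when the stream ends without the final "0" chunk;
    # out already holds every segment read before that point
  end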
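
The same idea can be wrapped in a small helper that also reports whether the body ended cleanly. This is only a sketch under some assumptions: fetch_tolerating_eof is a hypothetical name, not a Net::HTTP API, and the sketch handles plain HTTP only. Recent Ruby versions may retry an idempotent request once after such an error, which would run the block a second time and append the data twice, so the sketch sets max_retries to 0 where that setter is available.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of Net::HTTP).
# Returns [body, complete]; complete is false when the server closed
# the connection before the body was properly terminated.
def fetch_tolerating_eof(uri)
  body = String.new                 # empty binary string, safe for image data
  complete = true
  http = Net::HTTP.new(uri.host, uri.port)
  # Assumption: avoid an automatic retry that would duplicate the data.
  http.max_retries = 0 if http.respond_to?(:max_retries=)
  begin
    http.start do
      http.request_get(uri.request_uri) do |res|
        # Streaming form: segments are appended as they are read.
        res.read_body { |segment| body << segment }
      end
    end
  rescue EOFError
    complete = false                # keep whatever was read so far
  end
  [body, complete]
end

# Usage (the URL is a placeholder):
#   body, complete = fetch_tolerating_eof(URI.parse('http://example.com/image.jpg'))
#   File.open('test.jpg', 'wb') { |f| f.write(body) }
```

Opening the destination with "wb" and a block keeps binary data intact on every platform and closes the file when done.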
Yves-Eric Martin (Guest)
on 2009-04-14 12:33
(Received via mailing list)
Works like a charm!

Thank you Nobu for your great help. I owe you a beer.


--
Yves-Eric