Forum: Mongrel clients hang on large PUTs to Mongrel::HttpHandler-based web service

Randy Fischer (Guest)
on 2008-06-03 23:10
(Received via mailing list)
Hi folks,

I have a problem with a storage web service our group wrote using
Mongrel::HttpHandler. We have a consistent problem when using
HTTP PUT to this service when the data is larger than about 4 GB.
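
For context, here is a minimal sketch of the kind of
Mongrel::HttpHandler-based PUT service involved, written against the
classic Mongrel handler API; the class name, mount path and storage
directory below are illustrative, not our actual code:

   require 'mongrel'

   # Minimal PUT handler sketch: by the time process() runs, Mongrel has
   # already spooled the request body (into a Tempfile for large uploads),
   # so the handler just copies it to a permanent location and replies.
   class StorageHandler < Mongrel::HttpHandler
     def process(request, response)
       dest = File.join("/var/storage", File.basename(request.params["PATH_INFO"]))
       request.body.rewind
       File.open(dest, "wb") do |out|
         while chunk = request.body.read(16 * 1024)
           out.write(chunk)
         end
       end
       response.start(201) do |head, out|
         head["Content-Type"] = "text/plain"
         out.write("stored #{dest}\n")
       end
     end
   end

   server = Mongrel::HttpServer.new("0.0.0.0", "3000")
   server.register("/store", StorageHandler.new)
   server.run.join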

The web service actually retrieves and processes the data, but the
clients hang - the TCP connection is still in the ESTABLISHED
state on the client side, but the TCP session no longer exists on
the server side (the temporary file is unlinked from the directory
and, somewhat later, from the mongrel process).

I'm using mongrel 1.1.4, and as far as clients go, I've tried curl and
a java-based application using the jakarta commons http client
software - same issue.

I'm wondering if this is a simple 32-bit int issue in the
ragel-generated code?

Any advice on how to approach debugging/fixing this would be
appreciated - this is very repeatable.

Work-arounds would be met with almost equal glee.

Thanks,

-Randy Fischer
Zed A. Shaw (Guest)
on 2008-06-04 02:25
(Received via mailing list)
On Tue, 3 Jun 2008 17:09:05 -0400
"Randy Fischer" <rf@ufl.edu> wrote:

> Hi folks,
>
> I'm wondering if this is a simple 32-bit int issue in the ragel-generated
> code?

Shouldn't be, since the ragel code is only used to parse the headers,
and when that's done it then just streams 16k chunks from the socket to
a tempfile.  Now, if your headers are 4G then I'd like to know how you
did that since Mongrel would block you hard.
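
Roughly, that spooling loop looks like this - a simplified sketch of
what http_request.rb does, not the verbatim source, and the names are
approximate:

   require 'tempfile'

   # Sketch of Mongrel's body spooling: whatever body bytes arrived
   # along with the headers go into a Tempfile first, then the rest of
   # the Content-Length is read off the socket in 16k chunks.
   CHUNK_SIZE = 16 * 1024

   def spool_body(socket, content_length, partial_body)
     body = Tempfile.new("mongrel-body")
     body.binmode
     body.write(partial_body)
     remain = content_length - partial_body.length
     while remain > 0
       chunk = socket.readpartial([CHUNK_SIZE, remain].min)
       body.write(chunk)
       remain -= chunk.length
     end
     body.rewind
     body
   end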

Only thing that I could think of is that you aren't setting a size
properly at some point.  Either your header reports the wrong size, or
you're not setting it.

--
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/
Michael D'Auria (Guest)
on 2008-06-04 02:31
(Received via mailing list)
Randy,
Are you sure this is an issue with the size of the input and not the
amount of time that the connection is left open?

Michael
Randy Fischer (Guest)
on 2008-06-04 06:12
(Received via mailing list)
On Tue, Jun 3, 2008 at 8:30 PM, Michael D'Auria
<michael.dauria@gmail.com> wrote:
> Randy,
> Are you sure this is an issue with the size of the input and not the amount
> of time that the connection is left open?
> Michael

I'll check by using a smaller file size that I know will work, using
curl's bandwidth-limit feature to really increase the connection time.

Thanks for the suggestion!

-Randy
Randy Fischer (Guest)
on 2008-06-04 06:39
(Received via mailing list)
On Tue, Jun 3, 2008 at 8:16 PM, Zed A. Shaw <zedshaw@zedshaw.com> wrote:
> On Tue, 3 Jun 2008 17:09:05 -0400
> "Randy Fischer" <rf@ufl.edu> wrote:
>>
>> I'm wondering if this is a simple 32-bit int issue in the ragel-generated
>> code?
>
> Shouldn't be, since the ragel code is only used to parse the headers,
> and when that's done it then just streams 16k chunks from the socket to
> a tempfile.  Now, if your headers are 4G then I'd like to know how you
> did that since Mongrel would block you hard.

Naw, it's the content length in the body of a PUT. I ask since I saw

   int content_length

in http11_parser.c

> Only thing that I could think of is that you aren't setting a size
> properly at some point.  Either your header reports the wrong size, or
> you're not setting it.

Easily double-checked with tcpdump and the curl dump-headers stuff.
And so I will - thanks for the suggestion.

-Randy
Zed A. Shaw (Guest)
on 2008-06-04 22:25
(Received via mailing list)
On Wed, 4 Jun 2008 00:35:16 -0400
"Randy Fischer" <rf@ufl.edu> wrote:

> On Tue, Jun 3, 2008 at 8:16 PM, Zed A. Shaw <zedshaw@zedshaw.com> wrote:

> Naw, it's the content length in the body of a PUT,  I ask since I saw
>
>    int content_length

Well, looking in the source I can't see where that's actually used to
store the Content-Length header value.  It actually seems to be dead.
Instead you have this line in http_request.rb:

content_length = @params[Const::CONTENT_LENGTH].to_i

Now, that means it relies on Ruby's base integer type to store the
content length:

http://www.ruby-doc.org/core/classes/Fixnum.html

"A Fixnum holds Integer values that can be represented in a native
machine word (minus 1 bit). If any operation on a Fixnum exceeds this
range, the value is automatically converted to a Bignum."

Which is kind of vague, but there's a good chance it's implemented as a
32-bit signed integer giving you a problem with a 4G content size.  It
should be converted to a Bignum on overflow, but a quick test would be
to check the class of the content_length right after this line to see
what it's getting.
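
Something like this, dropped in right after that line in
http_request.rb, would show you what you're actually getting (the
STDERR line is just for the test):

   content_length = @params[Const::CONTENT_LENGTH].to_i

   # Quick check: on a 32-bit build Fixnum tops out at 2**30 - 1, so a
   # >4GB Content-Length should show up here as a Bignum rather than a
   # truncated Fixnum.
   STDERR.puts "content_length=#{content_length} (#{content_length.class})"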

--
Zed A. Shaw
- Hate: http://savingtheinternetwithhate.com/
- Good: http://www.zedshaw.com/
- Evil: http://yearofevil.com/
Randy Fischer (Guest)
on 2008-06-05 02:19
(Received via mailing list)
Great! I'll check the content length - right now, it's looking to
be some sort of network (maybe firewall) issue.  When I have
it figured out I will report back.  But instrumenting the mongrel
handler I wrote shows that it's attempting to put out a reasonable
response header.

Unfortunately security on this system is difficult - I just got
sudo access to tcpdump... I started out as a sysadmin,  but
man, they drive me crazy sometimes...

Did I say that mongrel rocks?

Thanks Zed.

-Randy
Randy Fischer (Guest)
on 2008-07-14 01:07
(Received via mailing list)
Follow up to an old problem, finally solved, in case anyone else
stumbles across the same problem.

> I have a problem with a storage web service our group wrote using
> Mongrel::HttpHandler. We have a consistent problem when using
> HTTP PUT to this service when the data is larger than about 4 GB.

Well, it turns out I could only repeat it consistently between two
particular systems.  There was some back and forth on this
list, and I threw out the red herring that the http11_parser.c code
used an unsigned int for the content size.   Zed pointed out that
particular variable was just dead code:

> Instead you have this line in http_request.rb:
>
> content_length = @params[Const::CONTENT_LENGTH].to_i
>
> Now, that means it relies on Ruby's base integer type to store the
> content length:

Since @params[Const::CONTENT_LENGTH] is a string, Ruby's
to_i method gets it right, returning a Bignum when the value won't
fit in a Fixnum - integer overflow was not the issue.
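
Easy to check from irb - on a 32-bit 1.8 build a ~4GB value simply
comes back as a Bignum:

   # String#to_i does not overflow: a large Content-Length string is
   # promoted to a Bignum instead of wrapping around.
   len = "4294967296".to_i
   puts len          # 4294967296
   puts len.class    # Bignum (Fixnum tops out at 2**30 - 1 on 32-bit)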


On Tue, Jun 3, 2008 at 8:30 PM, Michael D'Auria
<michael.dauria@gmail.com> wrote:
> Randy,
> Are you sure this is an issue with the size of the input and not the
> amount of time that the connection is left open?
> Michael

That turns out to be the correct answer, though I had (incorrectly)
eliminated it by using curl's bandwidth-limit option to get transfer
times greater than those exhibited by my 4GB transfers - those all
worked.

What was causing the problem was the lag between the end of the
upload/request from the client and the time when the server finally
sent a response after processing the request (the processing time was
entirely taken up with copying the upload from mongrel's temporary
file to its permanent location on disk).

Still, tcpdump showed that the response was making it back from the
server to the client intact and correct.

What was timing out was the firewall on the client system, which
was using stateful packet filtering (iptables on an oldish Red Hat
system).  The dead time in the HTTP request/response had
exceeded the time-to-live for the state tables.  Turning off the
keep-state flag in the firewall rules allowed the transfer to
complete.  Now it's just a matter of tweaking the parameters so
we can get keep-state working again.

Thanks for all the help on this.

-Randy Fischer
Chuck Remes (cremes)
on 2008-07-17 16:44
(Received via mailing list)
On Jul 13, 2008, at 6:03 PM, Randy Fischer wrote:

> Follow up to an old problem, finally solved, in case anyone else
> stumbles across the same problem.
>
> > I have a problem with a storage web service our group wrote using
> > Mongrel::HttpHandler. We have a consistent problem when using
> > HTTP PUT to this service when the data is larger than about 4 GB.

I wrote a mongrel handler (and a small patch to mongrel) about a year
ago that handled PUT a little more gracefully than the default. It
prevented mongrel from blocking during the upload.

Want me to send you the code? I imagine it's a tad out of date now,
but the idea was sound.

cr
Zed A. Shaw (Guest)
on 2008-07-18 07:58
(Received via mailing list)
On Sun, 13 Jul 2008 19:03:54 -0400
"Randy Fischer" <rf@ufl.edu> wrote:

> What was timing out was the firewall on the client system, which
> was using statefull packet filtering (iptables on an oldish redhat
> system).   The dead time in the http request/response had
> exceeded the time to live for the state tables.  Turning off the
> keep-state  flag in the firewall rules allowed the transfer to
> complete.  Now it's just a matter of tweaking the parameters so
> we can get keep-state working again.

Ah, yes, classic mistake.  People tend to think the client side just
works, but things like firewalls, routers, and those stupid anti-virus
programs are often the more likely causes of trouble.

Good job.