Nginx and upstream Content-Length

dubstep · September 30, 2011, 10:12am

Hi,

In a recent thread on the uwsgi mailing list[1], I began suspecting that
nginx will not honor an upstream’s Content-Length header. That is, if an
upstream declares a Content-Length of 1,000 bytes, but the connection is
broken after 500 bytes, nginx will still happily serve this entity with a
200 OK status.

This may be a known bug in nginx, but I wanted to be certain that I
understand it correctly and to draw attention to it on the nginx mailing
list, because I think this is a very serious bug with potentially
disastrous consequences, as I describe below.

I was able to confirm this both for uwsgi_pass and proxy_pass: if the
upstream sets a Content-Length and then breaks the connection before that
length is reached, nginx will pass the partial response on to the client.
Furthermore, since the upstream protocol is HTTP 1.0 but the nginx-client
protocol is HTTP 1.1 (with keepalive), the request will simply not
terminate, because the client can’t tell that the server has nothing more
to send and nginx will not break the connection, despite the fact that its
connection with the upstream was broken and there’s no chance this request
will ever be fulfilled.
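
To make the failure concrete, here is a minimal Python sketch of the byte-counting check a proxy could perform while reading an upstream body. It is an illustration only, not nginx code; `read_body` and its parameters are hypothetical names, and an in-memory stream stands in for the upstream socket.

```python
import io


def read_body(stream, content_length, bufsize=4096):
    """Read up to content_length bytes; return (body, complete)."""
    chunks = []
    remaining = content_length
    while remaining > 0:
        data = stream.read(min(bufsize, remaining))
        if not data:
            # EOF: the peer closed before sending everything it promised.
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks), remaining == 0


# Simulated upstream that declared 1,000 bytes but delivered only 500.
body, complete = read_body(io.BytesIO(b"x" * 500), 1000)

# Simulated well-behaved upstream, for comparison.
full_body, full_complete = read_body(io.BytesIO(b"x" * 1000), 1000)
```

In the first case `complete` is false, which is the signal that the response must not be treated as a finished entity.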

Things get far worse with gzip compression on: nginx will remove the
Content-Length header sent by the upstream and replace it with chunked
encoding - /incorrect chunked encoding/, which will make the client believe
it has the full entity, even though it has only a part of it.
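
The chunked case can be sketched in a few lines of Python. The helpers below are hypothetical and simplified (no chunk extensions or trailers). They assume, as the paragraph above describes, that the terminating zero-length chunk is written even though the body was cut short.

```python
def encode_chunked(pieces):
    """Frame the given byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for piece in pieces:
        if piece:
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last-chunk marker: tells the client "this is all"
    return bytes(out)


def decode_chunked(data):
    """Return (body, complete); complete is True once the last chunk is seen."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.find(b"\r\n", pos)
        if eol == -1:
            return bytes(body), False
        size = int(data[pos:eol].decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            return bytes(body), True
        body += data[pos:pos + size]
        if len(data) < pos + size + 2:
            return bytes(body), False
        pos += size + 2


# Only 500 of the 1,000 upstream bytes arrived, yet the framing is terminated.
wire = encode_chunked([b"x" * 500])
body, complete = decode_chunked(wire)
```

The decoder reports a complete 500-byte entity: once the original Content-Length is gone, nothing on the wire tells the client that 1,000 bytes were expected.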

Think about this with regard to caching, ETags and transparent proxy
caching: if something like this happens to a cacheable entity, especially
one with an ETag, and especially if a large ISP’s transparent proxy
intercepts the request, you might end up serving an incorrect
representation of the entity for a very long time and for many thousands of
requests (!).

Anyhow, I think the only sane resolution is for nginx to honor the upstream
Content-Length (and chunked encoding, if and when nginx supports it), and
to intentionally close the downstream connection prematurely if the
upstream connection is closed before the end of the Content-Length is
reached or the last chunk is received.
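
The reason a premature close helps is that a client can then detect the truncation on its own. The following Python sketch shows this with the standard library's `http.client`; `FakeSocket` is a hypothetical stand-in that replays a canned response and then reports end-of-stream, simulating a server that closes the connection after 500 of 1,000 declared bytes.

```python
import http.client
import io


class FakeSocket:
    """Stand-in for a socket: replays canned bytes, then reports EOF."""

    def __init__(self, data):
        self._file = io.BytesIO(data)

    def makefile(self, *args, **kwargs):
        return self._file


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Length: 1000\r\n"
       b"\r\n" + b"x" * 500)

response = http.client.HTTPResponse(FakeSocket(raw))
response.begin()  # parse the status line and headers
try:
    response.read()
    truncation_detected = False
except http.client.IncompleteRead as exc:
    # The connection ended before Content-Length bytes arrived.
    truncation_detected = True
    received = len(exc.partial)
```

With the Content-Length preserved and the connection closed, the client raises an error instead of accepting a partial entity, so it has no reason to store or reuse it.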

I suspect the recent work in nginx 1.1.4 for ngx_http_upstream_keepalive
could be relevant here, but I’m not sure, as I haven’t read the code.

I’ll be happy to hear your thoughts or provide further data.

Yaniv

1: http://comments.gmane.org/gmane.comp.python.wsgi.uwsgi.general/2061

Yaniv_Aknin · September 30, 2011, 11:59am

Hello!

On Fri, Sep 30, 2011 at 11:11:53AM +0300, Yaniv Aknin wrote:

> In a recent thread on the uwsgi mailing list[1], I began suspecting that
> nginx will not honor an upstream’s Content-Length header. i.e., if an
> upstream mentions a Content-Length of 1,000 bytes, but the connection is
> broken after 500 bytes, nginx will still happily serve this entity with a
> 200 OK status.

The 200 status code is irrelevant, as it’s generally not possible to
know in advance (i.e. before sending the status) whether the connection
will be broken.

> because the client can’t tell that the server has nothing more to send and
> nginx will not break the connection, despite the fact its connection with
> the upstream was broken and there’s no chance this request will ever be
> fulfilled.

> Things get far worse with gzip compression on - nginx will remove the
> Content-Length header sent by the upstream and replace it with chunked
> encoding - /incorrect chunked encoding/, that will make the client believe
> it has the full entity, even though it has only a part of this.

Yes, this is a known problem. The upstream module expects the backend to
behave properly, and if it misbehaves (or, more importantly, the
connection is broken for some reason) bad things may happen.

The upstream module’s code needs careful auditing to fix this. It’s
somewhere on my TODO list (though not very high).

Maxim D.

Yaniv_Aknin · September 30, 2011, 4:36pm

Well, I appreciate the time every contributor to nginx puts into the
project, especially your and Igor’s work, and I know I don’t dictate the
priority of your TODO, but I humbly think this is a far more serious issue
than you’re portraying.

Simply put, this bug can make nginx turn an otherwise noticeable data
corruption into a silent data corruption (including the serious cache
poisoning ramifications), and I think there are few bugs more urgent than
that.

Anyhow, I’ve said all I have to say on the matter, so unless further
information comes up on this, I’ll leave this topic alone (at least until
I get off my bum and come back here with a suggested patch to fix this…).

Thank you for your concise, timely and complete reply,

Yaniv

Yaniv_Aknin · September 30, 2011, 6:08pm

Sorry, I didn’t test FastCGI, so I can’t say for sure. I suspect it will
happen there as well, but that’s just an assumption.

Yaniv

Yaniv_Aknin · September 30, 2011, 5:08pm

Hi Yaniv

Thanks for investigating and reporting on this. Do you know whether
the gzip chunked-encoding problem exists for fastcgi_pass as well?

Cheers
Thomas