Default value of gzip_proxied

I recently bumped into some trouble with a client caching uncompressed
data, without understanding where it came from.

After a long investigation into what appeared to be random behavior, I
narrowed it down to the gzip_proxied
http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_proxied
directive. Content returned from the web server was supposed to always
be compressed (since compressed data is generally preferable whenever
possible), but when requests coming from clients behind proxies
resulted in a cache MISS, the returned content was uncompressed and
stored as such in the cache… thus serving cached uncompressed data to
the final clients.

Why is the default value of that directive ‘off’? What is the problem
with sending compressed data to proxies? Why have you decided on such a
default value?

Thanks,

B. R.

Hello!

On Fri, Mar 20, 2015 at 07:41:59PM +0100, B.R. wrote:

in the cache… thus serving cached uncompressed data to the final clients.

Why is the default value of that directive ‘off’? What is the problem
with sending compressed data to proxies? Why have you decided on such a
default value?

Because not all clients support compression, and it’s not possible
to instruct HTTP/1.0 proxies to serve the compressed version only to
some clients. In HTTP/1.1 there is the Vary header for this, but
nevertheless it’s usually a bad idea to use it, as it causes huge
cache duplication.
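
For completeness: if one accepts that duplication, a minimal sketch of
the Vary-based configuration would look like this (assuming all the
proxies involved speak HTTP/1.1):

    gzip         on;
    gzip_vary    on;   # send "Vary: Accept-Encoding" on compressible responses
    gzip_proxied any;  # compress even when the request carries a "Via" header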


Maxim D.
http://nginx.org/

Hello Maxim,

So HTTP/1.0 is the reason for all that.
Now I also understand why there are those parameters allowing compression
of data that should not be cached: nginx as a web server tries to be
smarter than those dumb HTTP/1.0 proxies.

I was wondering, though: are there real numbers to back this
compatibility concern?
Isn’t there a point in time when a horizon could be set, dropping
backwards compatibility for older software/standards?

HTTP/1.1 is the most used version of the protocol, nginx supports SPDY,
HTTP/2.0 is coming… and yet there are oddities kept for
backwards compatibility with HTTP/1.0.
That behavior made us cache uncompressed content ‘randomly’, since the
pattern was hard to find/reproduce, and it took a bit of luck to
determine the conditions under which we were caching uncompressed data…

What is the benefit/cost ratio of dropping compatibility (at least
partially) with HTTP/1.0?
I know I am being naive here, considering that most of the Web is
HTTP/1.1-compliant, but how far am I from reality?

B. R.

Hello!

On Sat, Mar 21, 2015 at 04:05:05PM +0100, B.R. wrote:

backwards compatibility for older software/standards?
I know I am being naive here, considering that most of the Web is
HTTP/1.1-compliant, but how far am I from reality?

There are two problems:

  • You assume HTTP/1.0 is dying. That’s not true. While uncommon
    nowadays for browsers, it’s still widely used by various
    software. In particular, nginx itself uses it by default when
    talking to upstream servers.

  • You assume that the behaviour in question is only needed for
    HTTP/1.0 clients. That’s, again, not true, as using “Vary:
    Accept-Encoding” isn’t a good idea either. As already mentioned,
    even if correctly supported it will cause cache data duplication.

If you don’t like the behaviour, you can always configure nginx to
do whatever you want. But I don’t think the default is worth
changing.


Maxim D.
http://nginx.org/

On 22.03.2015 3:31, Maxim D. wrote:

  • You assume that the behaviour in question is only needed for
    HTTP/1.0 clients. That’s, again, not true, as using “Vary: Accept-Encoding”
    isn’t a good idea either. As already mentioned, even if
    correctly supported it will cause cache data duplication.

If you don’t like the behaviour, you can always configure nginx to
do whatever you want. But I don’t think the default is worth
changing.

If gunzip is turned on, can nginx store only one compressed answer
from the origin server in its cache and still correctly provide
uncompressed content to proxies and clients which do not support
compression?


Best regards,
Gena

I do not get why you focus on the gzip_vary directive, while I was
explicitly talking about gzip_proxied.
The fact that content which was supposed to be compressed might actually
not be, because the request contains a ‘Via’ header, is the root cause
of our trouble… and you just told me it was for HTTP/1.0 compatibility.
This behavior, HTTP/1.0 compatibility aside, is strange and disruptive
at best.

I willingly grant you that a lot of software still uses HTTP/1.0, but I
usually distinguish that fact from the reasons behind it and from what
things should be.
I assume nginx defaults to talking HTTP/1.0 with backends because it is
the lowest common denominator. That allows handling outdated software,
and I can understand that when you wish to be universal.

nginx seems to be stuck not knowing which way the wind is blowing,
sometimes promoting modernity and sometimes enforcing backwards (yes,
HTTP/1.0 means looking backwards) compatibility.
While I perfectly understand setting default values for maximum
interoperability, there should be prominent pointers somewhere about
the fact that some directives only exist for such reasons. I would more
than welcome a default configuration that includes commented examples of
what a modern configuration/usage of nginx should be.

‘gzip on’ is clearly not enough if you want to send compressed content.
How many people know about it? The ‘RTFM’ stance is no longer valid
when multiple directives must be activated at once on a modern
infrastructure. nginx configuration was supposed to be lean and clean.
It is, provided you use an outdated protocol to serve content: the
minimal configuration for compatibility is smaller than the one for
modern protocols… and you need to dig by yourself to learn that. WTF?

B. R.

Hello!

On Sun, Mar 22, 2015 at 03:14:22PM +0100, B.R. wrote:

I do not get why you focus on the gzip_vary directive, while I was
explicitly talking about gzip_proxied.
The fact that content which was supposed to be compressed might actually
not be, because the request contains a ‘Via’ header, is the root cause
of our trouble… and you just told me it was for HTTP/1.0 compatibility.

With HTTP/1.0, there is only one safe option:

  • don’t compress anything for proxies.

With HTTP/1.1, there are two options:

  • don’t compress anything for proxies;

  • compress for proxies, but send Vary to avoid incorrect behaviour.

The second option, which becomes available if you don’t care
about HTTP/1.0 compatibility at all, has the downsides I’ve
talked about.


Maxim D.
http://nginx.org/

Hello!

On Sun, Mar 22, 2015 at 04:20:12AM +0200, Gena M. wrote:

If gunzip is turned on, can nginx store only one compressed answer
from the origin server in its cache and still correctly provide
uncompressed content to proxies and clients which do not support
compression?

Yes, though this requires some special configuration. In
particular, you have to instruct your backend to return gzip
(usually by “proxy_set_header Accept-Encoding gzip;” in nginx).
Additionally, if your backend returns “Vary: Accept-Encoding”,
you’ll have to instruct nginx to ignore it when using nginx 1.7.7+
(“proxy_ignore_headers Vary”).
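
A rough sketch of such a setup (hostnames and paths below are purely
illustrative, and the gunzip module has to be compiled in, e.g. with
--with-http_gunzip_module):

    proxy_cache_path /var/cache/nginx keys_zone=one:10m;

    upstream backend {
        server 127.0.0.1:8080;   # illustrative backend address
    }

    server {
        listen 80;

        # decompress cached gzip responses on the fly for clients
        # that do not support compression
        gunzip on;

        location / {
            proxy_pass           http://backend;
            proxy_cache          one;
            # always ask the backend for a gzipped response, so only one
            # (compressed) variant is ever stored in the cache
            proxy_set_header     Accept-Encoding gzip;
            # nginx 1.7.7+: do not let the backend's Vary header create
            # multiple cache variants
            proxy_ignore_headers Vary;
        }
    }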


Maxim D.
http://nginx.org/

Hi Maxim,

There is still something I do not get…

The gzip_proxied
http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_proxied
default value is set to honor the HTTP/1.0 protocol (which does not have
the Vary header and thus is unable to cache different versions of a
document) in some proxies.
However, the gzip_http_version
http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_http_version
default value is set so that only responses to HTTP/1.1 requests are
compressed…
Thus, with the default settings, it is impossible to compress responses
to requests advertising HTTP/1.0.
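
In other words, as I read the documentation, the defaults spell out to
something like this (gzip itself being off by default and enabled here
only for the sake of the example):

    gzip              on;
    gzip_http_version 1.1;  # default: responses to HTTP/1.0 requests are never compressed
    gzip_proxied      off;  # default: requests carrying a "Via" header are never compressed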

The RFC
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-14#section-2.5
dictates:

Intermediaries that process HTTP messages (i.e., all intermediaries
other than those acting as a tunnel) MUST send their own HTTP-Version
in forwarded messages. In other words, they MUST NOT blindly forward
the first line of an HTTP message without ensuring that the protocol
version matches what the intermediary understands, and is at least
conditionally compliant to, for both the receiving and sending of
messages.

‘tunnel’ is considered different from a ‘proxy’, as section 2.3
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-14#section-2.3
indicates:

There are three common forms of HTTP
intermediary: proxy, gateway, and tunnel.

Thus, any HTTP/1.0 proxy should be seen as sending an HTTP/1.0 protocol
version… and should thus naturally get an uncompressed version of the
page.

Non-compliant proxies can be bogus in thousands of ways, so there is no
point in trying to satisfy them anyway.

In light of these elements, I am still wondering why the default
behavior of the gzip module for HTTP/1.1 requests going through an
(HTTP/1.1) proxy is to send an unexpected, uncompressed version of the
page.

B. R.

Hello!

On Tue, Mar 24, 2015 at 07:11:17PM +0100, B.R. wrote:

Hi Maxim,

There is still something I do not get…

The gzip_proxied
http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_proxied
default value is set to honor the HTTP/1.0 protocol (which does not have
the Vary header and thus is unable to cache different versions of a
document) in some proxies.

You are still misunderstanding things. It’s one of the two
possible approaches to handle things even if we forget about
HTTP/1.0 completely.

Intermediaries that process HTTP messages (i.e., all intermediaries
other than those acting as a tunnel) MUST send their own HTTP-Version
in forwarded messages. In other words, they MUST NOT blindly forward
the first line of an HTTP message without ensuring that the protocol
version matches what the intermediary understands, and is at least
conditionally compliant to, for both the receiving and sending of
messages.

As you can see from the paragraph you quoted, nginx only knows
the HTTP version of the intermediary it got the request from. That
is, there is no guarantee that there are no HTTP/1.0 proxies along
the request/response chain.


Maxim D.
http://nginx.org/

Hi,

The gzip_proxied
http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_proxied
default value is set to honor the HTTP/1.0 protocol (which does not have
the Vary header and thus is unable to cache different versions of a
document) in some proxies.

You are still misunderstanding things. It’s one of the two
possible approaches to handle things even if we forget about
HTTP/1.0 completely.

Well, the only two possible approaches in 1.0 are to send compressed or
uncompressed data.
A client supporting the compressed version will understand the
uncompressed one, but the reverse might not be true.
So if you are not able to make the answer served from the cache vary
(which is the case in 1.0), you are actually stuck with a single option,
the only common denominator: no compression at all. Right?

As you can see from the paragraph you quoted, nginx only knows
the HTTP version of the intermediary it got the request from. That
is, there is no guarantee that there are no HTTP/1.0 proxies along
the request/response chain.

The text I quoted means that at the end of a chain of intermediaries,
you are ensured to end up with the greatest common denominator, i.e. if
a single element of the intermediary chain does not handle 1.1, it is
required to forward the request with a 1.0 version header, which will
then be left untouched by the following intermediaries (as 1.0 is the
lowest version available).
An intermediary seeing a 1.1 request coming in but not supporting that
version is required to step down to a version it understands, meaning
1.0. It should not forward 1.1.

If nginx sees 1.1 coming, it is my understanding that every intermediary
supports at least 1.1, whatever the number of intermediaries we are
talking about.

What is it that I do not get?

B. R.

Hello Maxim,

Thanks for taking the time to explain it all!
It seems HTTP/1.0 interoperability is seriously flawed…

Anyhow, now I understand nginx’s default behavior, which makes sense.
Our needs are very specific, since our nginx is hidden behind an
internal cache, but the general case, a real front end, is safe. :o)

Thanks again,

B. R.

Hello!

On Tue, Mar 24, 2015 at 11:59:24PM +0100, B.R. wrote:

Well, the only two possible approaches in 1.0 are to send compressed or
uncompressed data.
A client supporting the compressed version will understand the
uncompressed one, but the reverse might not be true.
So if you are not able to make the answer served from the cache vary
(which is the case in 1.0), you are actually stuck with a single option,
the only common denominator: no compression at all. Right?

Yes, the only option if we care about HTTP/1.0 is to avoid
compression for proxies.

As you can see from the paragraph you quoted, nginx only knows
the HTTP version of the intermediary it got the request from.
[…]
An intermediary seeing a 1.1 request coming in but not supporting that
version is required to step down to a version it understands, meaning
1.0. It should not forward 1.1.

If nginx sees 1.1 coming, it is my understanding that every intermediary
supports at least 1.1, whatever the number of intermediaries we are talking
about.

No. The text ensures that nginx will see HTTP/1.0 if the last
proxy doesn’t understand HTTP/1.1. There is no requirement to
preserve supported versions untouched. Moreover, the first sentence
requires intermediaries to use their own version. And RFC 2616
explicitly requires the same:

Due to interoperability problems with HTTP/1.0 proxies discovered
since the publication of RFC 2068[33], caching proxies MUST, gateways
MAY, and tunnels MUST NOT upgrade the request to the highest version
they support.

That is, in a request/response chain like this:

client -> proxy1 -> proxy2 -> nginx

If proxy1 supports only HTTP/1.0, but proxy2 supports HTTP/1.1,
nginx will see an HTTP/1.1 request.


Maxim D.
http://nginx.org/