Gunzip filter module 0.3

Hello!

Changes with gunzip module 0.3 (2010-03-22):

*) Bugfix: the "[alert] zero size buf" error during gunzipping of
   some files.
   Thanks to Matthieu T..

*) Bugfix: incorrect code was used for nginx 0.8.0 .. 0.8.24.  The
   bug had appeared in gunzip module 0.2.

Repository is available here:

http://mdounin.ru/hg/ngx_http_gunzip_filter_module/

Source tarball may be obtained here:

http://mdounin.ru/files/ngx_http_gunzip_filter_module-0.3.tar.gz

MD5 (ngx_http_gunzip_filter_module-0.3.tar.gz) =
32bb38f341786b042009e236c59cf1cc
SHA256 (ngx_http_gunzip_filter_module-0.3.tar.gz) =
c1860befc868970dce4085631e7717c749329883d8b05a2a64bd2850f346cd39

Maxim D.

Very interesting, Maxim.

I was wondering if this works with proxy_cache… will it unzip
compressed responses from cache?

If so, it would make a lot of sense to write something that would
compress items before putting them into cache, even if the backend
has not compressed them. And we could use a high compress-level if the
item is cachable. Since 95% of clients support gzip, this would seem
to be the ideal way to run an HTTP proxy (compression is done once per
stored object, and decompressed on the fly if necessary for a
particular client).

Any thoughts on that? Would an additional proxy_cache_gzip module be a
good idea, or would it have to be a patch to the main proxy_cache
module?

I suppose something similar could be done right now by having an extra
proxy layer (a “fetcher/compressor” layer) behind the cache, but that
seems like a hack.
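As a rough sketch of that "compress once, serve many" layout using pieces that already exist (all names, ports and paths below are made up, and whether the cached copy is actually gzipped depends entirely on what the backend sent):

```nginx
proxy_cache_path /var/cache/nginx keys_zone=pagecache:10m;

server {
    listen 80;

    location / {
        # Always request gzip from the backend, so the cached copy
        # is stored compressed.
        proxy_set_header Accept-Encoding gzip;
        proxy_pass http://127.0.0.1:8080;
        proxy_cache pagecache;

        # Inflate the cached gzipped body on the fly for the few
        # clients that do not support gzip.
        gunzip on;
    }
}
```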

On Mon, Mar 22, 2010 at 1:20 PM, Maxim D. [email protected]
wrote:


RPM

Hello!

On Tue, Mar 23, 2010 at 08:13:10AM -0500, Ryan M. wrote:

Very interesting, Maxim.

I was wondering if this works with proxy_cache… will it unzip
compressed responses from cache?

Yes.

Any thoughts on that? Would an additional proxy_cache_gzip module be a
good idea, or would it have to be a patch to the main proxy_cache
module?

I believe proxy_cache must cache the response as it was received from
upstream. It is not its business to compress or change anything;
there are output filters to do changes.

On the other hand, it is believed to be a good idea to implement
cache support in the gzip filter. I.e. the gzip filter will cache
gzipped content and send it to the client instead of re-compressing
it. It's actually in Igor's plans AFAIK, but most likely not the
near-term plans.

I suppose something similar could be done right now by having an extra
proxy layer (a “fetcher/compressor” layer) behind the cache, but that
seems like a hack.

Yes, this can be done easily with additional proxy layer.
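A minimal sketch of such an extra layer (ports, upstream names and the cache zone are made up): the front server caches, while an internal "fetcher/compressor" server talks to the real backend and applies a high gzip ratio before responses reach the cache.

```nginx
proxy_cache_path /var/cache/nginx keys_zone=my_cache:10m;

# Front layer: caches whatever the internal layer produces
# (including its gzipped responses).
server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:8081;
        proxy_cache my_cache;
        proxy_set_header Accept-Encoding gzip;  # always ask for gzip
    }
}

# Internal "fetcher/compressor" layer: fetches from the real
# backend and compresses once, with a high ratio.
server {
    listen 127.0.0.1:8081;
    location / {
        proxy_pass http://backend;
        gzip on;
        gzip_comp_level 9;
        gzip_proxied any;
    }
}
```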

Maxim D.

Hello!

On Wed, Mar 24, 2010 at 07:48:13AM -0500, Ryan M. wrote:

If that is so, why is it not the business of proxy_cache to transform
content (it already manipulates headers out of necessity)?

Yes, I’m referring to the nginx code. The main problem with the
upstream module is that it does too many things already. And teaching
it to do things which may (and should) be done elsewhere is a really
bad idea.

Re-applying the same output filter repeatedly is wasteful and
increases latency. If Igor is worried about the impact of updating
HTTP date strings more than once per second, surely avoiding thousands
of loops through a gzip filter is an optimization that would be smiled
upon?

While re-gzipping is indeed costly, just extracting pre-gzipped
content from gzip_cache isn’t much different from extracting
pre-gzipped content from proxy_cache. On the other hand,
gzip_cache will be able to use pre-gzipped content for many more
things.

Even Microsoft gets this specific part right (static content is cached
in its compressed state in IIS, and can use a different compression
ratio from dynamic content).

Yes, and gzip_cache will allow us to do the same thing. And it won’t
be tightly coupled with the proxy.

Though right now one may pre-gzip static files and use gzip_static - I
believe transparent gzip_cache will be better.

Also, any caching mechanism is going to need the same quantity of
settings and infrastructure as proxy_cache already has, so there would
be a lot of unnecessary code duplication if the mechanism was separate
from proxy_cache. But it would be more general (for nginx) to have a
separate standalone gzip_cache module.

Cache infrastructure was designed with many consumers in mind (and
e.g. the slowfs_cache module by Piotr S. uses it), so the
infrastructure is mostly present. What we have to do is to teach
gzip to save its responses to the cache.

Maxim D.

On Wednesday, March 24, 2010, Maxim D. [email protected] wrote:

Cache infrastructure was designed with many consumers in mind (and
e.g. the slowfs_cache module by Piotr S. uses it), so the
infrastructure is mostly present. What we have to do is to teach
gzip to save its responses to the cache.

Sounds like a good approach. My C is very rusty and never that good to
begin with, but I can help with specs, docs, testing, etc.

By the way, does anyone maintain an official public nginx repository?
I see some Git clones, and module writers all use varying SCM. I favor
Hg myself, but I remember reading Igor uses SVN privately…


RPM

On Tue, Mar 23, 2010 at 8:46 AM, Maxim D. [email protected]
wrote:

I believe proxy_cache must cache the response as it was received from
upstream. It is not its business to compress or change anything;
there are output filters to do changes.

Transformation of content by proxies and caches is specifically
allowed in the HTTP specs, unless a “Cache-Control: no-transform”
directive is present.

Or were you referring to the nginx architecture/code specifically? If
that is so, why is it not the business of proxy_cache to transform
content (it already manipulates headers out of necessity)?
Re-applying the same output filter repeatedly is wasteful and
increases latency. If Igor is worried about the impact of updating
HTTP date strings more than once per second, surely avoiding thousands
of loops through a gzip filter is an optimization that would be smiled
upon?

Even Microsoft gets this specific part right (static content is cached
in its compressed state in IIS, and can use a different compression
ratio from dynamic content).

On the other hand, it is believed to be a good idea to implement
cache support in the gzip filter. I.e. the gzip filter will cache
gzipped content and send it to the client instead of re-compressing
it. It's actually in Igor's plans AFAIK, but most likely not the
near-term plans.

Integrating the compression with the “retrieval” portion of the cache
code would allow for the use of high compression ratios for long-lived
objects, as well as prevent duplication of data on disk. Also, any
caching mechanism is going to need the same quantity of settings and
infrastructure as proxy_cache already has, so there would be a lot of
unnecessary code duplication if the mechanism was separate from
proxy_cache. But it would be more general (for nginx) to have a
separate standalone gzip_cache module.

RPM

Hi,

my company is working with ad insertion on Internet traffic.
For that I’m trying to use nginx. Some text/html traffic goes with
Content-Encoding: gzip.
So to parse the html inside this traffic I’ve tried gunzip, and it
partially works with some hacks. One of my problems was the function
ngx_http_gzip_ok, which returns NGX_OK if there is no Via header in
the request; all the traffic which I’m passing through gunzip doesn’t
have this header.

My question: is it correct to use the gunzip module in my case, or is
it better to try another module?
If my decision is correct, may I send you my changes for gunzip to add
the functionality which my company needs?

Posted at Nginx Forum:

Hello!

On Wed, Mar 24, 2010 at 05:49:32PM -0500, Ryan M. wrote:

[…]

By the way, does anyone maintain an official public nginx repository?
I see some Git clones, and module writers all use varying SCM. I favor
Hg myself, but I remember reading Igor uses SVN privately…

There is no official one (as Igor’s svn isn’t public). I maintain
mercurial repos with all public releases imported here:

http://mdounin.ru/hg/nginx-vendor-current
http://mdounin.ru/hg/nginx-vendor-0-7
http://mdounin.ru/hg/nginx-vendor-0-6
http://mdounin.ru/hg/nginx-vendor-0-5

Maxim D.

Perfect, Max,

I understood the style of your module; right now I’m working hard to
deploy it with just small hacks.

Actually we don’t need to do unzipping always: we need to unzip only
200 upstream responses and only text/html answers, to reduce the load
on the server. It looks better to coordinate with your way of
development, so I need brief instructions on how best to do it, and
I’ll send my patch for it.


--- /home/roman/work/ngx_http_gunzip_filter_module-0.3/ngx_http_gunzip_filter_module.c	2010-03-22 11:11:16.000000000 -0700
+++ ngx_http_gunzip_filter_module.c	2010-04-16 16:37:01.000000000 -0700
@@ -132,6 +132,7 @@
     if (!conf->enable
         || r->headers_out.content_encoding == NULL
         || r->headers_out.content_encoding->value.len != 4
+        || r->upstream->state->status != 200
         || ngx_strncasecmp(r->headers_out.content_encoding->value.data,
                            (u_char *) "gzip", 4) != 0)
     {
@@ -142,6 +143,9 @@

     r->gzip_vary = 1;

+    r->gzip_tested = 1;
+    r->gzip_ok = 1;
+
     if (!r->gzip_tested) {
         if (ngx_http_gzip_ok(r) == NGX_OK) {
             return ngx_http_next_header_filter(r);
@@ -315,7 +319,7 @@
     ctx->zstream.opaque = ctx;

     /* windowBits +16 to decode gzip, zlib 1.2.0.4+ */

-    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 16);
+    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 32); /* yahoo looks weird with previous init */

     if (rc != Z_OK) {
         ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,

If I do not apply the r->upstream->state->status != 200 check in
header processing, I get a lot of errors in the log; one of them is
Ой!, which sends a 302 redirect url with gzipped content. I’ve tried
to fix it, but got just an error from zlib; when I stored the dumped
data and used ‘gzip -d’ on it, everything decompressed fine and I got
normal HTML. How best to debug it? What advice can you give me?

Sincerely, Roman V.

Posted at Nginx Forum:

Hello!

On Fri, Apr 16, 2010 at 03:14:54PM -0400, theromis1 wrote:

My question: is it correct to use the gunzip module in my case, or is
it better to try another module?
If my decision is correct, may I send you my changes for gunzip
to add the functionality which my company needs?

As far as I understand, you need it to gunzip data in all cases to
make sure another filter module will be able to modify the response
correctly, right?

Yes, it’s the intended use case, though it’s not implemented yet. And
you may note the following comment in the gunzip filter:

/* TODO always gunzip - due to configuration or module request */

It should be trivial to support at least something like “gunzip
always;” which would gunzip all gzipped responses without checking
what ngx_http_gzip_ok() thinks about the client’s support for gzip.
Feel free to submit patches.

Maxim D.

p.s. You may want to use the mailing list instead of the forum-based
gateway; the forum is known to do awful things with messages.
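For illustration, a hypothetical snippet using such a directive might look like this (the “always” parameter is only proposed above, not implemented in gunzip 0.3, and the backend address is an assumption):

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;   # hypothetical backend that gzips

    # Proposed "gunzip always": inflate every gzipped response,
    # regardless of whether the client advertises gzip support,
    # so that later body filters can operate on plain text.
    gunzip always;
}
```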

Hello!

On Mon, Apr 19, 2010 at 05:15:16PM -0400, theromis1 wrote:

--- /home/roman/work/ngx_http_gunzip_filter_module-0.3/ngx_http_gunzip_filter_module.c	2010-03-22 11:11:16.000000000 -0700
+++ ngx_http_gunzip_filter_module.c	2010-04-16 16:37:01.000000000 -0700
@@ -132,6 +132,7 @@
     if (!conf->enable
         || r->headers_out.content_encoding == NULL
         || r->headers_out.content_encoding->value.len != 4
+        || r->upstream->state->status != 200

This is obviously wrong.

1. Nobody promised r->upstream is here. Expect coredumps on
   static requests and/or internal error responses.

2. Unzipping only responses with status 200 isn’t going to work as
   long as the client doesn’t support gzip at all.

If your module happens to process only 200 responses - well, it
should be considered to be a “module request” and coded as such.
Alternatively there may be some settings to request “gunzip
always” only for particular responses, but I tend to think it’s
overkill.

        || ngx_strncasecmp(r->headers_out.content_encoding->value.data,
                           (u_char *) "gzip", 4) != 0)
    {

@@ -142,6 +143,9 @@

     r->gzip_vary = 1;

+    r->gzip_tested = 1;
+    r->gzip_ok = 1;

No, you shouldn’t modify nginx’s idea of whether the client supports
gzip. Instead, you should bypass the whole detection logic if you need
to gunzip regardless of the client’s support.

And your code suggests that further tests will assume the client
supports gzip, while some don’t. This may lead to weird results
if you have the gzip filter enabled.

    if (!r->gzip_tested) {
        if (ngx_http_gzip_ok(r) == NGX_OK) {
            return ngx_http_next_header_filter(r);

@@ -315,7 +319,7 @@
     ctx->zstream.opaque = ctx;

     /* windowBits +16 to decode gzip, zlib 1.2.0.4+ */
-    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 16);
+    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 32); /* yahoo looks weird with previous init */

+32 means it will also decode zlib streams, which isn’t what’s
expected with gzip content-encoding; that’s content-encoding deflate.
And there are differences.

How best to debug it? What advice can you give me?

They return incorrect data in the reply:

00000000  1f 8b 08 00 00 00 00 00 00 03 02 00 00 00 ff ff
00000010  1f 8b 08 00 00 00 00 00 00 03 2d 8e bb 0e 82 40
00000020  10 45 7b be 62 a4 b0 d3 51 28 1d d6 44 c1 68 e2
00000030  ab 58 0b cb 95 1d b3 46 58 08 2c 46 fe 5e 1e 76
00000040  33 73 ee e4 5c 9a c4 97 ad bc 5f 13 d8 cb d3 11
00000050  ae b7 cd f1 b0 05 7f 86 78 48 e4 0e 31 96 f1 48
00000060  82 f9 02 31 39 fb c2 23 e3 f2 4c 90 61 a5 bb c5
00000070  bd 5c c6 22 5c 04 b0 2b 1a ab 09 c7 83 47 38 04
00000080  e8 51 e8 b6 ff 59 8a 3f ef 26 8f 4a 21 0d 83 2e
00000090  d2 26 67 eb c0 a8 1a f2 e2 c3 1a 48 81 a9 f8 19
000000a0  f9 d8 2a ab 6b 56 55 6a d6 8e bf 2e aa 1b fb 66
000000b0  3b 55 79 b9 ca aa 28 58 86 be 30 5c 31 a1 12 73
000000c0  c2 b2 37 0e ae ce d0 f7 f3 7e 75 a4 7e 57 da 00
000000d0  00 00
000000d2

The first 16 bytes are an incomplete/broken gzip member. The correct
one is at offset 0x10 (and it indeed may be decoded to valid html).

It’s interesting how they achieved this. Hey, anybody from Yandex
here? Comments?

Maxim D.

Hello!

On Tue, Apr 20, 2010 at 07:00:49PM -0400, theromis1 wrote:

As I understand it, if I pass +16 here it will work only with gzip;
+32 will take both.

I perfectly understand what +32 does. I just don’t think that it
should be done. The headers indicate gzip, and the data we see isn’t
gzip. So it’s an error and it should be reported.

We have huge traffic, and through this module go a lot of sites
which don’t follow the RFC; browsers support both of them. I’m just
counting alerts in the error log, and with +32 this number is much
smaller.

Just a note: if you are trying to proxy the whole internet, you have
probably chosen the wrong tool. nginx was never designed to be a
generic forward proxy; it’s a reverse proxy and it expects
controllable backends. And it has quite a few limitations which
render it hardly usable as a forward proxy (including fully buffered
request messages, no http/1.1 support to upstreams, no protection
for X-Accel-* response headers and so on).

Regarding the yandex stuff, I don’t know why they made it this way,
but they send 3 chunks: the 1st is a 16-byte weird gzip header and
nothing more, the next chunk is correct html content, and the 3rd one
is zero size.

Yes, when we are talking about HTTP/1.1 and chunked encoding.
Last two chunks are normal (actual content + final chunk), but the
first one is broken.

Idiotic, but browsers can handle it.

No, they can’t. They just happen to ignore the message body when
handling redirects.

May we work this way: try to gunzip; if we get an error, try to
gunzip the next chunk; or if we have a fatal error, just pass the
whole stream unchanged, at least on the first chunk?

While it is technically possible to do so by postponing sending the
header until after we see some body parts (e.g. the xslt filter does
this) - it is a bit tricky and would have some bad implications with
the current nginx code; e.g. 304 responses would read the body from
static files instead of just doing stat(), and so on.

Sure, these can all be resolved, but I don’t see any benefits for
supported configurations here, only for the unsupported forward proxy
case.

 } ngx_http_gunzip_conf_t;

+static ngx_conf_bitmask_t  ngx_http_gzip_pass_mask[] = {

[...]

       offsetof(ngx_http_gunzip_conf_t, bufs),
       NULL },

+    { ngx_string("gunzip_filter_types"),
+      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_1MORE,
+      ngx_http_types_slot,
+      NGX_HTTP_LOC_CONF_OFFSET,
+      offsetof(ngx_http_gunzip_conf_t, types_keys),
+      &ngx_http_html_default_types[0] },

I don’t like the naming and syntax, sorry. The suffix “_pass” is used
for backend modules (proxy_pass, memcached_pass, fastcgi_pass), but I
believe it should be handled by the “gunzip” directive itself anyway.

The directive “gunzip_filter_types” should be “gunzip_types”, and I
think it should be “*” by default. I.e. gunzip all types by default,
and make it possible to reduce the list. This way types are completely
orthogonal to other settings.

       ngx_null_command
 };

@@ -130,6 +158,10 @@
     /* TODO ignore content encoding? */

     if (!conf->enable
+        || ((conf->pass & NGX_HTTP_GUNZIP_PASS_200) && (r->upstream)
+            && (r->upstream->state) && (r->upstream->state->status != 200))

I still don’t understand why you check
r->upstream->state->status. There is r->status.

And as I already said, I don’t like the whole idea of “gunzipping
only 200 responses”.

+    {
+    } else if (!r->gzip_ok) {
+    }

Don’t hesitate to use goto.

+    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 32);
+#if (nginx_version < 8000)

The relevant change was introduced in 0.8.29.

+    {
+        return NGX_CONF_ERROR;
+    }
+    return NGX_CONF_OK;
 }

Maxim D.

ok,

I’ve created a patch which should fit the nginx rules and my needs;
please find it below.
When I checked the sources of zlib.h I found this: “windowBits can
also be greater than 15 for optional gzip decoding. Add 32 to
windowBits to enable zlib and gzip decoding with automatic header
detection, or add 16 to decode only the gzip format (the zlib format
will return a Z_DATA_ERROR).”
As I understand it, if I pass +16 here it will work only with gzip;
+32 will take both.

We have huge traffic, and through this module go a lot of sites which
don’t follow the RFC; browsers support both of them. I’m just counting
alerts in the error log, and with +32 this number is much smaller.

Regarding the yandex stuff, I don’t know why they made it this way,
but they send 3 chunks: the 1st is a 16-byte weird gzip header and
nothing more, the next chunk is correct html content, and the 3rd one
is zero size.

Idiotic, but browsers can handle it. May we work this way: try to
gunzip; if we get an error, try to gunzip the next chunk; or if we
have a fatal error, just pass the whole stream unchanged, at least on
the first chunk?

My patch follows:

--- /root/work/ngx_http_gunzip_filter_module-0.3/ngx_http_gunzip_filter_module.c	2010-03-22 11:11:16.000000000 -0700
+++ ngx_http_gunzip_filter_module.c	2010-04-20 15:59:01.000000000 -0700
@@ -16,6 +16,9 @@
 typedef struct {
     ngx_flag_t           enable;
     ngx_bufs_t           bufs;
+    ngx_uint_t           pass;
+    ngx_hash_t           types;
+    ngx_array_t         *types_keys;
 } ngx_http_gunzip_conf_t;

@@ -40,6 +43,17 @@
     ngx_http_request_t  *request;
 } ngx_http_gunzip_ctx_t;

+#define NGX_HTTP_GUNZIP_PASS_ANY           0x02
+#define NGX_HTTP_GUNZIP_PASS_CONTENT_TYPE  0x04
+#define NGX_HTTP_GUNZIP_PASS_200           0x08
+
+static ngx_conf_bitmask_t  ngx_http_gzip_pass_mask[] = {
+    { ngx_string("any"), NGX_HTTP_GUNZIP_PASS_ANY },
+    { ngx_string("content_type"), NGX_HTTP_GUNZIP_PASS_CONTENT_TYPE },
+    { ngx_string("200"), NGX_HTTP_GUNZIP_PASS_200 },
+    { ngx_null_string, 0 }
+};
+
 static ngx_int_t
 ngx_http_gunzip_filter_inflate_start(ngx_http_request_t *r,
     ngx_http_gunzip_ctx_t *ctx);
@@ -78,6 +92,20 @@
       offsetof(ngx_http_gunzip_conf_t, bufs),
       NULL },

+    { ngx_string("gunzip_pass"),
+      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_1MORE,
+      ngx_conf_set_bitmask_slot,
+      NGX_HTTP_LOC_CONF_OFFSET,
+      offsetof(ngx_http_gunzip_conf_t, pass),
+      &ngx_http_gzip_pass_mask },
+
+    { ngx_string("gunzip_filter_types"),
+      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_1MORE,
+      ngx_http_types_slot,
+      NGX_HTTP_LOC_CONF_OFFSET,
+      offsetof(ngx_http_gunzip_conf_t, types_keys),
+      &ngx_http_html_default_types[0] },
+
       ngx_null_command
 };

@@ -130,6 +158,10 @@
     /* TODO ignore content encoding? */

     if (!conf->enable
+        || ((conf->pass & NGX_HTTP_GUNZIP_PASS_200) && (r->upstream)
+            && (r->upstream->state) && (r->upstream->state->status != 200))
+        || ((conf->pass & NGX_HTTP_GUNZIP_PASS_CONTENT_TYPE)
+            && ngx_http_test_content_type(r, &conf->types) == NULL)
         || r->headers_out.content_encoding == NULL
         || r->headers_out.content_encoding->value.len != 4
         || ngx_strncasecmp(r->headers_out.content_encoding->value.data,
@@ -138,26 +170,29 @@
         return ngx_http_next_header_filter(r);
     }

+    if (! conf->pass & NGX_HTTP_GUNZIP_PASS_ANY )
+    {
 #if (nginx_version >= 8025 || (nginx_version >= 7065 && nginx_version < 8000))

-    r->gzip_vary = 1;
-
-    if (!r->gzip_tested) {
-        if (ngx_http_gzip_ok(r) == NGX_OK) {
-            return ngx_http_next_header_filter(r);
-        }
-    } else if (!r->gzip_ok) {
-        return ngx_http_next_header_filter(r);
-    }
+        r->gzip_vary = 1;
+
+        if (!r->gzip_tested) {
+            if (ngx_http_gzip_ok(r) == NGX_OK) {
+                return ngx_http_next_header_filter(r);
+            }
+        } else if (!r->gzip_ok) {
+            return ngx_http_next_header_filter(r);
+        }

 #else

-    if (ngx_http_gzip_ok(r) == NGX_OK) {
-        return ngx_http_next_header_filter(r);
-    }
+        if (ngx_http_gzip_ok(r) == NGX_OK) {
+            return ngx_http_next_header_filter(r);
+        }

 #endif
+    }

     ctx = ngx_pcalloc(r->pool, sizeof(ngx_http_gunzip_ctx_t));
     if (ctx == NULL) {
@@ -314,8 +349,8 @@
     ctx->zstream.zfree = ngx_http_gunzip_filter_free;
     ctx->zstream.opaque = ctx;

-    /* windowBits +16 to decode gzip, zlib 1.2.0.4+ */
-    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 16);
+    /* windowBits +32 to decode gzip and zlib, zlib 1.2.0.4+ */
+    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 32);

     if (rc != Z_OK) {
         ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,
@@ -669,6 +704,24 @@
     ngx_conf_merge_bufs_value(conf->bufs, prev->bufs,
                               (128 * 1024) / ngx_pagesize,
                               ngx_pagesize);

+    ngx_conf_merge_bitmask_value(conf->pass, prev->pass,
+                                 NGX_CONF_BITMASK_SET);
+
+#if (nginx_version < 8000)
+    if (ngx_http_merge_types(cf, conf->types_keys, &conf->types,
+                             prev->types_keys, &prev->types,
+                             ngx_http_html_default_types)
+        != NGX_OK)
+#else
+    if (ngx_http_merge_types(cf, &conf->types_keys, &conf->types,
+                             &prev->types_keys, &prev->types,
+                             ngx_http_html_default_types)
+        != NGX_OK)
+#endif
+    {
+        return NGX_CONF_ERROR;
+    }
+    return NGX_CONF_OK;
 }

Posted at Nginx Forum:

Hi Max,

today I’m done with your last suggestions; attaching the new patch.
Also I’ve added an option “gunzip_strict”: when it is turned off, it
adds 32 instead of 16 in inflateInit2, and in case of an inflate error
in the first chunk, sends the content unchanged. Please review it and
give me recommendations. The default is on.

About “gunzip_types *” by default: in the current development version
of nginx I can not find the ability to remove or replace already
existing string hashes using ngx_http_types_slot. That was the reason
for having the option content_type; by default, as you said, all mime
types go to gunzip. If you can show me an example in nginx of the
logic which you want to realize, I’ll be happy to make it.

BTW, what do you think if I add an option to mod_proxy for disabling
request buffering and add support for proxying connection: keep-alive?

--- ngx_http_gunzip_filter_module-0.3/ngx_http_gunzip_filter_module.c	2010-03-22 11:11:16.000000000 -0700
+++ ngx_http_gunzip_filter_module.c	2010-04-22 18:24:54.000000000 -0700
@@ -16,10 +16,15 @@
 typedef struct {
     ngx_flag_t           enable;
     ngx_bufs_t           bufs;
+    ngx_uint_t           pass;
+    ngx_hash_t           types;
+    ngx_array_t         *types_keys;
+    ngx_flag_t           strict;
 } ngx_http_gunzip_conf_t;

 typedef struct {
+    ngx_chain_t         *_in;
     ngx_chain_t         *in;
     ngx_chain_t         *free;
     ngx_chain_t         *busy;
@@ -35,11 +40,24 @@
     unsigned             redo:1;
     unsigned             done:1;
     unsigned             nomem:1;
+    unsigned             first:1;
+    unsigned             strict:1;

     z_stream             zstream;
     ngx_http_request_t  *request;
 } ngx_http_gunzip_ctx_t;

+#define NGX_HTTP_GUNZIP_PASS_ANY           0x02
+#define NGX_HTTP_GUNZIP_PASS_CONTENT_TYPE  0x04
+#define NGX_HTTP_GUNZIP_PASS_200           0x08
+
+static ngx_conf_bitmask_t  ngx_http_gzip_pass_mask[] = {
+    { ngx_string("any"), NGX_HTTP_GUNZIP_PASS_ANY },
+    { ngx_string("content_type"), NGX_HTTP_GUNZIP_PASS_CONTENT_TYPE },
+    { ngx_string("200"), NGX_HTTP_GUNZIP_PASS_200 },
+    { ngx_null_string, 0 }
+};
+
 static ngx_int_t
 ngx_http_gunzip_filter_inflate_start(ngx_http_request_t *r,
     ngx_http_gunzip_ctx_t *ctx);
@@ -78,6 +96,27 @@
       offsetof(ngx_http_gunzip_conf_t, bufs),
       NULL },

+    { ngx_string("gunzip_options"),
+      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_1MORE,
+      ngx_conf_set_bitmask_slot,
+      NGX_HTTP_LOC_CONF_OFFSET,
+      offsetof(ngx_http_gunzip_conf_t, pass),
+      &ngx_http_gzip_pass_mask },
+
+    { ngx_string("gunzip_strict"),
+      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_FLAG,
+      ngx_conf_set_flag_slot,
+      NGX_HTTP_LOC_CONF_OFFSET,
+      offsetof(ngx_http_gunzip_conf_t, strict),
+      NULL },
+
+    { ngx_string("gunzip_types"),
+      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_1MORE,
+      ngx_http_types_slot,
+      NGX_HTTP_LOC_CONF_OFFSET,
+      offsetof(ngx_http_gunzip_conf_t, types_keys),
+      &ngx_http_html_default_types[0] },
+
       ngx_null_command
 };

@@ -130,6 +169,10 @@
     /* TODO ignore content encoding? */

     if (!conf->enable
+        || ((conf->pass & NGX_HTTP_GUNZIP_PASS_200) && (r->upstream)
+            && (r->upstream->state) && (r->upstream->state->status != 200))
+        || ((conf->pass & NGX_HTTP_GUNZIP_PASS_CONTENT_TYPE)
+            && ngx_http_test_content_type(r, &conf->types) == NULL)
         || r->headers_out.content_encoding == NULL
         || r->headers_out.content_encoding->value.len != 4
         || ngx_strncasecmp(r->headers_out.content_encoding->value.data,
@@ -138,6 +181,8 @@
         return ngx_http_next_header_filter(r);
     }

+    if ( conf->pass & NGX_HTTP_GUNZIP_PASS_ANY )
+        goto ok;

 #if (nginx_version >= 8025 || (nginx_version >= 7065 && nginx_version < 8000))

     r->gzip_vary = 1;

@@ -159,6 +204,7 @@

 #endif

+ok:
     ctx = ngx_pcalloc(r->pool, sizeof(ngx_http_gunzip_ctx_t));
     if (ctx == NULL) {
         return NGX_ERROR;
@@ -167,6 +213,8 @@
     ngx_http_set_ctx(r, ctx, ngx_http_gunzip_filter_module);

     ctx->request = r;
+    ctx->strict = conf->strict;
+    ctx->first = 1;

     r->filter_need_in_memory = 1;

@@ -206,6 +254,7 @@
         if (ngx_chain_add_copy(r->pool, &ctx->in, in) != NGX_OK) {
             goto failed;
         }
+        ctx->_in = in;
     }

     if (ctx->nomem) {
@@ -314,8 +363,8 @@
     ctx->zstream.zfree = ngx_http_gunzip_filter_free;
     ctx->zstream.opaque = ctx;

-    /* windowBits +16 to decode gzip, zlib 1.2.0.4+ */
-    rc = inflateInit2(&ctx->zstream, MAX_WBITS + 16);
+    /* windowBits +32 to decode gzip and zlib, zlib 1.2.0.4+ */
+    rc = inflateInit2(&ctx->zstream, MAX_WBITS + (ctx->strict ? 16 : 32));

     if (rc != Z_OK) {
         ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,
@@ -431,10 +480,19 @@
     rc = inflate(&ctx->zstream, ctx->flush);

     if (rc != Z_OK && rc != Z_STREAM_END && rc != Z_BUF_ERROR) {
-        ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,
-                      "inflate() failed: %d, %d", ctx->flush, rc);
-        return NGX_ERROR;
+        if (ctx->strict == 0 && ctx->first == 1) {
+            ctx->out = ctx->_in;
+            ctx->done = 1;
+            ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,
+                          "passing raw content, inflate() failed: %d, %d",
+                          ctx->flush, rc);
+            return NGX_OK;
+        } else {
+            ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,
+                          "inflate() failed: %d, %d", ctx->flush, rc);
+            return NGX_ERROR;
+        }
     }

+    ctx->first = 0;

     ngx_log_debug5(NGX_LOG_DEBUG_HTTP, r->connection->log, 0,
                    "inflate out: ni:%p no:%p ai:%ud ao:%ud rc:%d",
@@ -653,6 +711,7 @@
      */

     conf->enable = NGX_CONF_UNSET;
+    conf->strict = NGX_CONF_UNSET;

     return conf;
 }
@@ -665,10 +724,29 @@
     ngx_http_gunzip_conf_t *conf = child;

     ngx_conf_merge_value(conf->enable, prev->enable, 0);
+    ngx_conf_merge_value(conf->strict, prev->strict, 1);

     ngx_conf_merge_bufs_value(conf->bufs, prev->bufs,
                               (128 * 1024) / ngx_pagesize,
                               ngx_pagesize);

+    ngx_conf_merge_bitmask_value(conf->pass, prev->pass,
+                                 NGX_CONF_BITMASK_SET);
+
+#if (nginx_version < 8029)
+    if (ngx_http_merge_types(cf, conf->types_keys, &conf->types,
+                             prev->types_keys, &prev->types,
+                             ngx_http_html_default_types)
+        != NGX_OK)
+#else
+    if (ngx_http_merge_types(cf, &conf->types_keys, &conf->types,
+                             &prev->types_keys, &prev->types,
+                             ngx_http_html_default_types)
+        != NGX_OK)
+#endif
+    {
+        return NGX_CONF_ERROR;
+    }
+    return NGX_CONF_OK;
 }

Posted at Nginx Forum:

Hi Max,

what do you think about committing these changes into mod_gunzip?

Posted at Nginx Forum:

Another question, regarding checking the status of the upstream.
What do you think about changing the gunzip functionality to allow
writing a config like this:

    if ( $upstream_status = 200 ) {
        gunzip on;
        gunzip_options any;
    }

?

Posted at Nginx Forum: