Nginx with custom modules crashes in gzip crc32()

I have nginx with my custom module that rewrites content under certain
conditions. Without it everything works fine, but after enabling it
nginx starts to crash approximately every 2 hours (> 100 req./sec). The
core dump shows that it crashes in the gzip module:

Core was generated by `nginx: worker process '.
Program terminated with signal 11, Segmentation fault.
#0 0x000000343ea0286d in crc32 () from /usr/lib64/libz.so.1
(gdb) bt
#0 0x000000343ea0286d in crc32 () from /usr/lib64/libz.so.1
#1 0x0000000000450f3f in ngx_http_gzip_filter_add_data (r=0x601f380,
in=0x29ac280) at src/http/modules/ngx_http_gzip_filter_module.c:708
#2 ngx_http_gzip_body_filter (r=0x601f380, in=0x29ac280) at
src/http/modules/ngx_http_gzip_filter_module.c:394
#3 0x0000000000451ac5 in ngx_http_postpone_filter (r=0x601f380,
in=0x29ac280) at src/http/ngx_http_postpone_filter_module.c:82
#4 0x00000000004521b1 in ngx_http_ssi_body_filter (r=0x343ea0c9c0,
in=0x7906442d) at src/http/modules/ngx_http_ssi_filter_module.c:392
#5 0x00000000004564b5 in ngx_http_charset_body_filter (r=0x343ea0c9c0,
in=0x61c0ff8) at src/http/modules/ngx_http_charset_filter_module.c:552
#6 0x0000000000457a5c in ngx_http_sub_body_filter (r=0x343ea0c9c0,
in=0x29ac280) at src/http/modules/ngx_http_sub_filter_module.c:188
#7 0x0000000000470e8f in ngx_http_af_filter (r=0x601f380, in=0x29ac280)
at /usr/src/redhat/SOURCES/af-headers/ngx_af_headers_module.c:768
#8 0x00000000004793e6 in clweb_c_body_filter (r=0x601f380,
in=0x29ac280) at
/usr/src/redhat/SOURCES/content-parser-module/ngx_mod_content_parser.c:510
#9 0x0000000000479d28 in ngx_http_gunzip_body_filter (r=0x601f380,
in=0x29ac280) at
/usr/src/redhat/SOURCES/gunzip/ngx_http_gunzip_filter_module.c:323
#10 0x000000000047ca5d in ngx_subr_body_filter (r=0x601f380,
in=0x1f5f830) at
/usr/src/redhat/SOURCES/ngx_subr_module/ngx_subr_module.c:219
#11 0x000000000047d46d in ngx_subr_body_filter (r=0x601f380,
in=0x1f5f830) at
/usr/src/redhat/SOURCES/ngx_subr_all_module/ngx_subr_all_module.c:219
#12 0x000000000040ba99 in ngx_output_chain (ctx=0x5e958c0, in=0x61c0ff8)
at src/core/ngx_output_chain.c:65
#13 0x000000000043bbf7 in ngx_http_copy_filter (r=0x601f380,
in=0x1f5f830) at src/http/ngx_http_copy_filter_module.c:141
#14 0x000000000044bdd1 in ngx_http_range_body_filter (r=0x343ea0c9c0,
in=0x61c0ff8) at src/http/modules/ngx_http_range_filter_module.c:551
#15 0x000000000042ed92 in ngx_http_output_filter (r=0x601f380,
in=0x1f5f830) at src/http/ngx_http_core_module.c:1868
#16 0x00000000004463fa in ngx_http_upstream_process_non_buffered_request
(r=0x601f380, do_write=<value optimized out>) at
src/http/ngx_http_upstream.c:2381
#17 0x00000000004468fc in
ngx_http_upstream_process_non_buffered_upstream (r=0x601f380,
u=0x6096858) at src/http/ngx_http_upstream.c:2352
#18 0x00000000004457f6 in ngx_http_upstream_handler (ev=0x148e568) at
src/http/ngx_http_upstream.c:917
#19 0x00000000004269be in ngx_epoll_process_events (cycle=0x796480,
timer=<value optimized out>, flags=<value optimized out>) at
src/event/modules/ngx_epoll_module.c:635
#20 0x000000000041e3c8 in ngx_process_events_and_timers (cycle=0x796480)
at src/event/ngx_event.c:245
#21 0x0000000000425203 in ngx_worker_process_cycle (cycle=0x796480,
data=<value optimized out>) at src/os/unix/ngx_process_cycle.c:800
#22 0x0000000000423967 in ngx_spawn_process (cycle=0x796480,
proc=0x425118 <ngx_worker_process_cycle>, data=0x0, name=0x4d01ce
"worker process", respawn=-3) at src/os/unix/ngx_process.c:196
#23 0x000000000042478c in ngx_start_worker_processes (cycle=0x796480,
n=12, type=-3) at src/os/unix/ngx_process_cycle.c:360
#24 0x0000000000425964 in ngx_master_process_cycle (cycle=0x796480) at
src/os/unix/ngx_process_cycle.c:136
#25 0x0000000000408dba in main (argc=22, argv=0x795060) at
src/core/nginx.c:405

dmesg says:
nginx[20500]: segfault at 61c1000 ip 000000343ea0286d sp
00007fff348eec78 error 4 in libz.so.1.2.3[343ea00000+14000]

so libz is trying to read something at a bad address.
This is definitely not an nginx bug, but nginx is not simple to debug,
so I would be happy for any advice on what could cause it to crash.
Maybe it's a stack problem?

Server info: CentOS Linux 5.6, amd64, nginx 1.0.6, zlib-devel-1.2.3-3

On Mon, Oct 17, 2011 at 12:27:19AM -0400, artemg wrote:

It seems your module erroneously overwrites the memory area that
corresponds to the ctx->zstream.next_in or ctx->zstream.avail_in
values:

    ctx->crc32 = crc32(ctx->crc32, ctx->zstream.next_in,
                       ctx->zstream.avail_in);
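
crc32() here reads exactly ctx->zstream.avail_in bytes starting at
ctx->zstream.next_in, so if either value is corrupted, zlib walks past
the end of the buffer. A minimal standalone sketch of that failure mode
(plain zlib, not nginx source):

    #include <zlib.h>
    #include <string.h>

    int
    main(void)
    {
        unsigned char  buf[16];
        z_stream       zs;
        uLong          crc;

        memset(buf, 'x', sizeof(buf));
        memset(&zs, 0, sizeof(z_stream));

        zs.next_in = buf;
        zs.avail_in = sizeof(buf);

        crc = crc32(0L, Z_NULL, 0);                 /* initial value */
        crc = crc32(crc, zs.next_in, zs.avail_in);  /* fine: 16 bytes */

        /* but if another module has overwritten avail_in, e.g.
         * zs.avail_in = 1000000; the same call reads far past buf
         * and segfaults inside crc32(), as in the backtrace above */

        return (int) crc;
    }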


Igor S.

Thanks for the answer, but it's difficult to understand how that can
happen. I even compared the output of a response modified by my module
with an unmodified one - they are the same, the only difference being
that the data in the chains is filled in a different way. I.e. the
input data lengths are 100-200-100 bytes (.last - .pos), and after my
content-rewriting module they can be 0-0-400, or even a 0-0-0 chain,
with the data added in the next chain. I am doing buffering to match
some patterns. Because of this I see in the error log:

[alert] 31974#0: *98657 zero size buf in writer t:1 r:1 f:0
0000000001714FE0 0000000001F8D290-0000000001F8D290 0000000000000000 0-0
while sending to client

Is it OK to have zero-size bufs, or do I need to modify the chain so
that they are not passed forward to the other modules?

agentzh, thanks for the answer, it seems the problem really is the
zero-size bufs. I changed the code to insert spaces if a buffer is
empty, and everything started to work with gzip disabled (before that
nothing worked even with gzip off), and I think there will be no
crashes in gzip now when it is enabled. Next I will create my own
chains to pass to the downstream filters.
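
Something like the following is what I have in mind - just a sketch
(the filter name is mine; the next-filter chaining is the standard
nginx filter-module pattern): relink only the bufs worth passing on,
skipping zero-size non-special ones:

    #include <ngx_config.h>
    #include <ngx_core.h>
    #include <ngx_http.h>

    static ngx_http_output_body_filter_pt  ngx_http_next_body_filter;

    static ngx_int_t
    ngx_my_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
    {
        ngx_chain_t  *cl, *tl, *out, **ll;

        out = NULL;
        ll = &out;

        for (cl = in; cl; cl = cl->next) {

            /* zero-size and not a flush/sync/last_buf marker:
             * exactly what triggers "zero size buf in writer" */
            if (ngx_buf_size(cl->buf) == 0 && !ngx_buf_special(cl->buf)) {
                continue;
            }

            tl = ngx_alloc_chain_link(r->pool);
            if (tl == NULL) {
                return NGX_ERROR;
            }

            tl->buf = cl->buf;
            *ll = tl;
            ll = &tl->next;
        }

        *ll = NULL;

        if (out == NULL) {
            return NGX_OK;   /* nothing left to pass downstream */
        }

        return ngx_http_next_body_filter(r, out);
    }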

I didn't try valgrind because it usually consumes a lot more CPU, and
that is unacceptable on the staging machine (there would be high CPU
load, or I would have to decrease the number of requests and then wait
longer for a crash). But thanks for reminding me about it.

On Wed, Oct 19, 2011 at 8:20 AM, artemg [email protected] wrote:

Passing zero-size non-special bufs to the downstream output filters is
surely BAD. You have to fix it :)

Also, using tools like valgrind's memcheck to find memory issues in
your modules is highly recommended and can often save you a huge number
of hours of debugging :)
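
For example, something along these lines keeps the overhead down to a
single foreground worker (a sketch; the nginx binary and config paths
are placeholders):

    # nginx.conf: one foreground process, so memcheck sees
    # the whole request path
    daemon off;
    master_process off;

    # then run that single process under memcheck:
    valgrind --tool=memcheck --log-file=/tmp/valgrind.log \
        /usr/sbin/nginx -c /etc/nginx/nginx.conf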

Regards,
-agentzh

On Thu, Oct 20, 2011 at 7:17 PM, artemg [email protected] wrote:

By the way, passing a zero-size last_buf is OK, as I understand? What
do you mean by "non-special bufs"?

A buf with last_buf set is "special". Check out the ngx_buf_special
macro definition in nginx core's src/core/ngx_buf.h:

#define ngx_buf_special(b)                                                   \
    ((b->flush || b->last_buf || b->sync)                                    \
     && !ngx_buf_in_memory(b) && !b->in_file)
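
So a body filter usually distinguishes the two cases roughly like this
(a sketch; the helper name is made up, and the usual nginx module
headers are assumed):

    static ngx_int_t
    ngx_my_check_chain(ngx_chain_t *in)
    {
        ngx_chain_t  *cl;

        for (cl = in; cl; cl = cl->next) {

            if (ngx_buf_special(cl->buf)) {
                /* a flush/sync/last_buf marker that carries no data:
                 * fine to pass downstream, zero size and all */
                continue;
            }

            if (ngx_buf_size(cl->buf) == 0) {
                /* zero-size and not special: the bad case that
                 * triggers the "zero size buf in writer" alert */
                return NGX_ERROR;
            }
        }

        return NGX_OK;
    }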

Regards,
-agentzh
