Nginx crash - possibly keepalives

Hi all,

After turning on keepalives, we’ve been seeing one crash pretty
consistently.
We’re running nginx 1.1.5, and here is the backtrace:

(gdb) bt
#0  ngx_write_chain_to_file (file=0x173dbd0, cl=0x0, offset=26779,
    pool=0x162b850) at src/os/unix/ngx_files.c:162
#1  0x000000000040ea3e in ngx_write_chain_to_temp_file (tf=0x173dbd0,
    chain=0x0) at src/core/ngx_file.c:35
#2  0x000000000041c534 in ngx_event_pipe_write_chain_to_temp_file
    (p=0x173d048) at src/event/ngx_event_pipe.c:771
#3  0x000000000041cd3d in ngx_event_pipe_read_upstream (p=0x173d048,
    do_write=0) at src/event/ngx_event_pipe.c:234
#4  ngx_event_pipe (p=0x173d048, do_write=0) at src/event/ngx_event_pipe.c:50
#5  0x000000000043e1b6 in ngx_http_upstream_process_upstream (r=0x1ae4bb0,
    u=0x1afb4f8) at src/http/ngx_http_upstream.c:2704
#6  0x000000000043e612 in ngx_http_upstream_handler (ev=0x689b) at
    src/http/ngx_http_upstream.c:936
#7  0x000000000041b1f6 in ngx_event_process_posted
    (cycle=<value optimized out>, posted=0x0) at src/event/ngx_event_posted.c:39
#8  0x000000000041b0c6 in ngx_process_events_and_timers (cycle=0x151bbf0) at
    src/event/ngx_event.c:272
#9  0x0000000000420bc3 in ngx_worker_process_cycle (cycle=0x151bbf0,
    data=<value optimized out>) at src/os/unix/ngx_process_cycle.c:801
#10 0x000000000041f4cc in ngx_spawn_process (cycle=0x151bbf0, proc=0x420aeb
    <ngx_worker_process_cycle>, data=<value optimized out>, name=0x470dc9
    "worker process", respawn=5) at src/os/unix/ngx_process.c:196
#11 0x0000000000421359 in ngx_reap_children (cycle=0x151bbf0) at
    src/os/unix/ngx_process_cycle.c:617
#12 ngx_master_process_cycle (cycle=0x151bbf0) at
    src/os/unix/ngx_process_cycle.c:180
#13 0x000000000040700a in main (argc=<value optimized out>,
    argv=<value optimized out>) at src/core/nginx.c:405

It seems that p->in is NULL in ngx_event_pipe_write_chain_to_temp_file(),
but I can’t figure out why.
It’s using the proxy module as an upstream.
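
For reference, the setup is along these lines (a minimal sketch rather
than our exact config; the upstream name and address are placeholders):

    # hypothetical minimal config; name and address are placeholders
    upstream backend {
        server 127.0.0.1:8080;
        keepalive 16;    # the keepalive cache we recently enabled
    }

    server {
        location / {
            proxy_http_version 1.1;
            proxy_set_header Connection "";  # needed for upstream keepalive
            proxy_pass http://backend;
        }
    }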

Let me know if I can provide more info to help!

Matthieu.

Hello!

On Tue, Oct 11, 2011 at 07:53:36PM -0700, Matthieu T. wrote:

chain=0x0) at src/core/ngx_file.c:35
#2 0x000000000041c534 in ngx_event_pipe_write_chain_to_temp_file
(p=0x173d048) at src/event/ngx_event_pipe.c:771

Do you have any local patches here, or is it just some debug logging
added?

$ grep -n ngx_write_chain_to_temp_file src/event/ngx_event_pipe.c
521: if (ngx_write_chain_to_temp_file(p->temp_file, &file)
765: if (ngx_write_chain_to_temp_file(p->temp_file, out) == NGX_ERROR) {

[…]

It seems that p->in is NULL in ngx_event_pipe_write_chain_to_temp_file(),
but I can’t figure out why.
It’s using the proxy module as an upstream.

Let me know if I can provide more info to help!

Could you please show:

fr 2
p *p
p p->bufs
p *p->temp_file

fr 5
p *r
p *u
p u->headers_in

I think I see what’s going on here, just to confirm. The
underlying issue seems to be one identified in commit message
here:

http://trac.nginx.org/nginx/changeset/4136/nginx

(all available buffers in the p->out chain)

Setting proxy_busy_buffers_size to something lower than the total size
of proxy_buffers may help as a workaround.
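
For example (sizes here are purely illustrative, not a recommendation):

    proxy_buffer_size 8k;
    proxy_buffers 8 8k;             # 64k in total
    proxy_busy_buffers_size 16k;    # well below the total buffer size

The idea is to leave enough headroom that not all buffers can end up in
the busy chain at once.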

Maxim D.

Hello!

On Wed, Oct 12, 2011 at 09:32:41AM -0700, Matthieu T. wrote:

We’re running nginx 1.1.5, and here is the backtrace:
[…]

length = 168, … }

Ok, it looks like I’m right and all buffers are in the busy chain.
Likely this happens due to the upstream sending the response in many
small chunks.

I’ll try to reproduce it here and provide proper fix.

[…]

We’ll try that and see if that has a positive effect, thank you.
Also, would having u->buffering == 0 (proxy_buffering off) help in this
scenario?

Yes, switching off proxy_buffering should help. Switching off
disk buffering (proxy_cache off; proxy_max_temp_file_size 0;)
should help as well.
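
In configuration terms, that is (illustrative):

    # option 1: don't buffer the upstream response at all
    proxy_buffering off;

    # option 2: keep memory buffering but avoid the temp-file code path
    proxy_cache off;
    proxy_max_temp_file_size 0;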

Maxim D.

Hello!

On Wed, Oct 12, 2011 at 09:22:41PM +0400, Maxim D. wrote:

Hi all,

After turning on keepalives, we’ve been seeing one crash pretty
consistently.
We’re running nginx 1.1.5, and here is the backtrace:

[…]

Ok, it looks like I’m right and all buffers are in the busy chain.
Likely this happens due to the upstream sending the response in many
small chunks.

I’ll try to reproduce it here and provide proper fix.

Please try the attached patch.

Maxim D.

On Wed, Oct 12, 2011 at 5:20 AM, Maxim D. [email protected]
wrote:

(gdb) bt
#0 ngx_write_chain_to_file (file=0x173dbd0, cl=0x0, offset=26779,
pool=0x162b850) at src/os/unix/ngx_files.c:162
#1 0x000000000040ea3e in ngx_write_chain_to_temp_file (tf=0x173dbd0,
chain=0x0) at src/core/ngx_file.c:35
#2 0x000000000041c534 in ngx_event_pipe_write_chain_to_temp_file
(p=0x173d048) at src/event/ngx_event_pipe.c:771

Do you have any local patches here, or is it just some debug logging
added?

There is one patch, but it isn’t in this code path and shouldn’t have
any effect here.

fr 2
p *p
p p->bufs
p *p->temp_file

fr 5
p *r
p *u
p u->headers_in

(gdb) fr 2
#2 0x000000000041c534 in ngx_event_pipe_write_chain_to_temp_file
(p=0x173d048) at src/event/ngx_event_pipe.c:771
771 src/event/ngx_event_pipe.c: No such file or directory.
in src/event/ngx_event_pipe.c
(gdb) p *p
$1 = {upstream = 0x7f3e2cbe1010, downstream = 0x7f3e2cbdd950,
free_raw_bufs
= 0x0, in = 0x0, last_in = 0x173d060, out = 0x0, last_out = 0x17b9bf8,
free
= 0x17b9b30,
busy = 0x17b9ec0, input_filter = 0x45c873
<ngx_http_proxy_chunked_filter>,
input_ctx = 0x1ae4bb0, output_filter = 0x42897f
<ngx_http_output_filter>,
output_ctx = 0x1ae4bb0,
read = 0, cacheable = 1, single_buf = 0, free_bufs = 1, upstream_done
= 0,
upstream_error = 0, upstream_eof = 0, upstream_blocked = 0,
downstream_done
= 0,
downstream_error = 0, cyclic_temp_file = 0, cache_file_maxed = 0,
allocated = 9, bufs = {num = 9, size = 8192}, tag = 0x693400, busy_size =
8192, read_length = 27269,
length = 168, max_temp_file_size = 1048576, max_cache_file_size = 0,
temp_file_write_size = 16384, read_timeout = 120000, send_timeout =
60000,
send_lowat = 0,
pool = 0x162b850, log = 0x16530b0, preread_bufs = 0x0, preread_size =
12,
buf_to_file = 0x0, temp_file = 0x173dbd0, num = 61}
(gdb) p p->bufs
$2 = {num = 9, size = 8192}
(gdb) p *p->temp_file
$3 = {file = {fd = 308, name = {len = 46, data = 0x173dd88
"/usr/local/nginx-cache/storage/temp/0009520674"}, info = {st_dev = 0,
st_ino = 0, st_nlink = 0, st_mode = 0,
st_uid = 0, st_gid = 0, __pad0 = 0, st_rdev = 0, st_size = 0,
st_blksize = 0, st_blocks = 0, st_atim = {tv_sec = 0, tv_nsec = 0},
st_mtim
= {tv_sec = 0, tv_nsec = 0},
st_ctim = {tv_sec = 0, tv_nsec = 0}, __unused = {0, 0, 0}}, offset =
26779, sys_offset = 26779, log = 0x16530b0, valid_info = 0, directio = 0},
offset = 26779,
path = 0x152efa0, pool = 0x162b850, warn = 0x0, access = 0, log_level
= 0,
persistent = 1, clean = 0}

(gdb) fr 5
#5 0x000000000043e1b6 in ngx_http_upstream_process_upstream
(r=0x1ae4bb0,
u=0x1afb4f8) at src/http/ngx_http_upstream.c:2704
2704 src/http/ngx_http_upstream.c: No such file or directory.
in src/http/ngx_http_upstream.c
(gdb) p *r
$4 = {signature = 1347703880, connection = 0x7f3e2cbdd950, ctx =
0x162bc60,
main_conf = 0x151c9e0, srv_conf = 0x1537c68, loc_conf = 0x157d2c0,
read_event_handler = 0x43e840
<ngx_http_upstream_rd_check_broken_connection>, write_event_handler =
0x43edf5 <ngx_http_upstream_process_downstream>, cache = 0x173d188,
upstream = 0x1afb4f8, upstream_states = 0x173d658, pool = 0x162b850,
header_in = 0x15bff00, headers_in = {headers = {last = 0x162c668, part =
{elts = 0x1afa9b0, nelts = 20,
next = 0x162c668}, size = 48, nalloc = 20, pool = 0x162b850},
host =
0x1afaa10, connection = 0x0, if_modified_since = 0x1afae30,
if_unmodified_since = 0x0,
user_agent = 0x1afae00, referer = 0x0, content_length = 0x0,
content_type = 0x0, range = 0x0, if_range = 0x0, transfer_encoding =
0x0,
expect = 0x0,
accept_encoding = 0x0, via = 0x0, authorization = 0x0, keep_alive =
0x0,
x_forwarded_for = 0x0, x_real_ip = 0x1afaa40, user = {len = 0, data =
0x0},
passwd = {len = 0,
data = 0x0}, cookies = {elts = 0x162c558, nelts = 0, size = 8,
nalloc
= 2, pool = 0x162b850}, server = {len = 13,
data = 0x162c57d
“sir-emeth.comx-real-ipconnect-via-httpscf-host-origin-ipcf-ipcountrycf-iptypecf-use-byccf-use-obcf-pref-obcf-pref-geoloccf-identify-botcf-cache-levelcf-force-miss-tscf-minify-modehost-spoofset-expires”…},
content_length_n = -1, keep_alive_n = -1, connection_type = 0, msie = 0,
msie6 = 0, opera = 0, gecko = 0, chrome = 0,
safari = 0, konqueror = 0}, headers_out = {headers = {last =
0x1ae4d58,
part = {elts = 0x162b8a0, nelts = 2, next = 0x0}, size = 48, nalloc =
20,
pool = 0x162b850},
status = 200, status_line = {len = 6, data = 0x1afb981 “200 OK0
\321s\001”}, server = 0x0, date = 0x0, content_length = 0x0,
content_encoding = 0x0, location = 0x0,
refresh = 0x0, last_modified = 0x0, content_range = 0x0,
accept_ranges =
0x0, www_authenticate = 0x0, expires = 0x0, etag = 0x0, override_charset =
0x0,
content_type_len = 9, content_type = {len = 9, data = 0x173d9a8
“text/html”}, charset = {len = 0, data = 0x0}, content_type_lowcase =
0x0,
content_type_hash = 0,
cache_control = {elts = 0x0, nelts = 0, size = 0, nalloc = 0, pool =
0x0}, content_length_n = -1, date_time = 0, last_modified_time = -1},
request_body = 0x173d158,
lingering_time = 0, start_sec = 1318377836, start_msec = 445, method =
2,
http_version = 1001, request_line = {len = 116,
data = 0x1822f70 “GET
/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”}, uri = {len = 12,
data = 0x1afb392 “/@@se/bible/”}, args = {len = 95,
data = 0x1822f7c
“bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”}, exten = {len = 0, data = 0x0},
unparsed_uri = {len = 103, data = 0x1822f74
“/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”},
method_name = {len = 3,
data = 0x1822f70 “GET
/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”}, http_protocol = {
len = 8, data = 0x1822fdc “HTTP/1.1\r\nCF-IDTag”}, out = 0x17ba158,
main
= 0x1ae4bb0, parent = 0x0, postponed = 0x0, post_subrequest = 0x0,
posted_requests = 0x0,
virtual_names = 0x0, phase_handler = 11, content_handler = 0x45960e
<ngx_http_proxy_handler>, access_code = 0, variables = 0x162bdd8,
ncaptures
= 0, captures = 0x1afb130,
captures_data = 0x1822f74
“/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”, limit_rate = 0,
header_size = 271, request_length = 736, err_status = 0,
http_connection =
0x1653108, log_handler = 0x42f4b2 <ngx_http_log_error_handler>, cleanup =
0x173d6b8,
subrequests = 51, count = 1, blocked = 0, aio = 0, http_state = 2,
complex_uri = 0, quoted_uri = 0, plus_in_uri = 0, space_in_uri = 0,
invalid_header = 0,
add_uri_to_alias = 0, valid_location = 1, valid_unparsed_uri = 0,
uri_changed = 0, uri_changes = 10, request_body_in_single_buf = 0,
request_body_in_file_only = 0,
request_body_in_persistent_file = 0, request_body_in_clean_file = 0,
request_body_file_group_access = 0, request_body_file_log_level = 5,
subrequest_in_memory = 0,
waited = 0, cached = 0, gzip_tested = 0, gzip_ok = 0, gzip_vary = 0,
gzip_force = 0, proxy = 0, bypass_cache = 0, no_cache = 0,
limit_zone_set =
0, limit_req_set = 0,
pipeline = 0, plain_http = 0, chunked = 1, header_only = 0, keepalive
= 1,
lingering_close = 0, discard_body = 0, internal = 1, error_page = 0,
ignore_content_encoding = 0,
filter_finalize = 0, post_action = 0, request_complete = 0,
request_output
= 1, header_sent = 1, expect_tested = 0, root_tested = 0, done = 0,
logged =
0, buffered = 0,
main_filter_need_in_memory = 0, filter_need_in_memory = 0,
filter_need_temporary = 0, allow_ranges = 0, stat_reading = 0,
stat_writing
= 1, state = 0, header_hash = 116750,
lowcase_index = 3, lowcase_header = “viansfer-encoding”, ‘\000’
<repeats
14 times>,
header_name_start = 0x1ae749c “33’>\n”, ’ ’ <repeats 12 times>,
“\n”,
’ ’ <repeats 12 times>, “\n”, ’ ’ <repeats 12
times>,
“\n”, ’ ’ <repeats 12 times>, “\n”, ’ ’
<repeats
12 times>, “\n”, ’ ’ <repeats 12 times>, “\r\n1a\r\n\n \r\n6\r\n \r\n9\r\n \n\r\n”…,
header_name_end = 0x1ae745c “‘20’ width=‘5’>\n”, ’ ’ <repeats 12
times>,
“\n”, ’ ’ <repeats 12 times>, “\n”, ’ ’
<repeats
12 times>, “\n”, ’ ’ <repeats 12 times>, “\n”, ’
’ <repeats 12 times>, “\n”, ’ ’ <repeats 12 times>, “\n”, ’ ’ <repeats 12 times>, "\n "…,
header_start = 0x1ae745e “0’ width=‘5’>\n”, ’ ’ <repeats 12 times>,
“\n”, ’ ’ <repeats 12 times>, “\n”, ’ ’
<repeats
12 times>, “\n”, ’ ’ <repeats 12 times>, “\n”, ’
’ <repeats 12 times>, “\n”, ’ ’ <repeats 12 times>, “\n”, ’ ’ <repeats 12 times>, "\n “…,
header_end = 0x1ae749c "33'>\n", ' ' <repeats 12 times>, "\n", ' '
<repeats 12 times>, "\n", ' ' <repeats 12 times>,
"\n", ' ' <repeats 12 times>, "\n", ' ' <repeats 12 times>, "\n", ' '
<repeats 12 times>, "\r\n1a\r\n\n \r\n6\r\n
\r\n9\r\n \n\r\n"…,
uri_start = 0x1822f74
“/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”,
uri_end = 0x1822fdb " HTTP/1.1\r\nCF-IDTag”, uri_ext = 0x0,
args_start = 0x1822f7c
“bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”,
request_start = 0x1822f70 “GET
/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”,
request_end = 0x1822fe4 “\r\nCF-IDTag”,
method_end = 0x1822f72 “T
/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nCF-IDTag”, schema_start = 0x0,
schema_end = 0x0, host_start = 0x0, host_end = 0x0, port_start = 0x0,
port_end = 0x0, http_minor = 1, http_major = 1}
(gdb) p *u
$5 = {read_event_handler = 0x43e15b
<ngx_http_upstream_process_upstream>,
write_event_handler = 0x43b866 <ngx_http_upstream_dummy_handler>, peer =
{
connection = 0x7f3e2cbe1010, sockaddr = 0x1afb970, socklen = 16,
name =
0x173d730, tries = 1, get = 0x465c0a
<ngx_http_upstream_get_keepalive_peer>,
free = 0x465954 <ngx_http_upstream_free_keepalive_peer>, data =
0x173d780, set_session = 0x465262
<ngx_http_upstream_keepalive_set_session>,
save_session = 0x465275 <ngx_http_upstream_keepalive_save_session>,
local = 0x157cf00, rcvbuf = 0, log = 0x16530b0, cached = 0, log_error =
1},
pipe = 0x173d048,
request_bufs = 0x173d648, output = {buf = 0x0, in = 0x0, free = 0x0,
busy
= 0x0, sendfile = 1, directio = 0, unaligned = 0, need_in_memory = 0,
need_in_temp = 0,
alignment = 512, pool = 0x162b850, allocated = 0, bufs = {num = 1,
size
= 131072}, tag = 0x693400, output_filter = 0x4096f0 <ngx_chain_writer>,
filter_ctx = 0x1afb5f0},
writer = {out = 0x0, last = 0x1afb5f0, connection = 0x7f3e2cbe1010,
pool =
0x162b850, limit = 0}, conf = 0x157d858, headers_in = {headers = {last =
0x1afb628, part = {
elts = 0x173d7e0, nelts = 5, next = 0x0}, size = 48, nalloc = 8,
pool = 0x162b850}, status_n = 200, status_line = {len = 6, data =
0x1afb981
“200 OK0 \321s\001”},
status = 0x0, date = 0x173d7e0, server = 0x173d810, connection =
0x0,
expires = 0x0, etag = 0x0, x_accel_expires = 0x0, x_accel_redirect =
0x0,
x_accel_limit_rate = 0x0,
content_type = 0x173d840, content_length = 0x0, last_modified = 0x0,
location = 0x0, accept_ranges = 0x0, www_authenticate = 0x0,
transfer_encoding = 0x173d870,
content_encoding = 0x0, content_length_n = -1, cache_control = {elts =
0x0, nelts = 0, size = 0, nalloc = 0, pool = 0x0}, connection_close = 0,
chunked = 1},
resolved = 0x173d000, buffer = {
pos = 0x1ae7330 “…”, last = 0x1ae7330 “…”,
file_pos = 0, file_last = 0,
start = 0x1ae7330 “…”, end = 0x1ae9330 “”, tag = 0x693400, file =
0x0,
shadow = 0x0, temporary = 1,
memory = 0, mmap = 0, recycled = 1, in_file = 0, flush = 0, sync =
0,
last_buf = 0, last_in_chain = 0, last_shadow = 0, temp_file = 0, num =
9},
length = -1,
out_bufs = 0x0, busy_bufs = 0x0, free_bufs = 0x0, input_filter_init =
0x4584f1 <ngx_http_proxy_input_filter_init>,
input_filter = 0x459db3 <ngx_http_proxy_non_buffered_chunked_filter>,
input_filter_ctx = 0x1ae4bb0, create_key = 0x45b21c
<ngx_http_proxy_create_key>,
max_cache_file_size = 0, create_request = 0x45a9bf
<ngx_http_proxy_create_request>, reinit_request = 0x458470
<ngx_http_proxy_reinit_request>,
process_header = 0x45a572 <ngx_http_proxy_process_header>,
abort_request =
0x45861a <ngx_http_proxy_abort_request>,
finalize_request = 0x45861c <ngx_http_proxy_finalize_request>,
rewrite_redirect = 0, timeout = 0, state = 0x173d680, method = {len = 0,
data = 0x0}, schema = {len = 7,
data = 0x1afb8f0
“http:///bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly”},
uri = {len = 103,
data = 0x173d4b4
"/bible/?bookSelect=2+Chronicles&chapterSelect=13&verseSelect=7&addedRadio=Brackets&vsRefRadio=VerseOnly
HTTP/1.1\r\nHost: sir-emeth.com\r\nConnection:
Keep-Alive\r\nAccept-Encoding: gzip\r\nCF-Connecting-IP: "…}, cleanup =
0x173d6b8, bind_hash = 3776832724, store = 0, cacheable = 1, no_cache =
0,
accel = 1, ssl = 0,
cache_status = 1, buffering = 1, keepalive = 0, keepalive_status = 0,
request_sent = 1, header_sent = 1}
(gdb) p u->headers_in
$6 = {headers = {last = 0x1afb628, part = {elts = 0x173d7e0, nelts = 5,
next
= 0x0}, size = 48, nalloc = 8, pool = 0x162b850}, status_n = 200,
status_line = {len = 6,
data = 0x1afb981 “200 OK0 \321s\001”}, status = 0x0, date =
0x173d7e0,
server = 0x173d810, connection = 0x0, expires = 0x0, etag = 0x0,
x_accel_expires = 0x0,
x_accel_redirect = 0x0, x_accel_limit_rate = 0x0, content_type =
0x173d840, content_length = 0x0, last_modified = 0x0, location = 0x0,
accept_ranges = 0x0,
www_authenticate = 0x0, transfer_encoding = 0x173d870,
content_encoding =
0x0, content_length_n = -1, cache_control = {elts = 0x0, nelts = 0, size =
0, nalloc = 0,
pool = 0x0}, connection_close = 0, chunked = 1}

Hope that this helps!

We’ll try that and see if that has a positive effect, thank you.
Also, would having u->buffering == 0 (proxy_buffering off) help in this
scenario?

So in order to try to reproduce the problem locally, I made this debug
module:

This is meant to have two nginx instances chained together:
nginx#1 has proxy_pass to nginx#2 while using keepalives;
nginx#2 has mass_chunk on, which for this extreme test will chunk a 15M file
into 10-byte chunks (in separate TCP packets or not). A sketch of the
configuration follows.
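
Roughly like this (mass_chunk is from my debug module, with the chunk
size set inside the module for this test; ports and paths here are
placeholders, everything else is stock nginx):

    # nginx#1 - front proxy, keepalive connections to nginx#2
    upstream chunky {
        server 127.0.0.1:8081;
        keepalive 16;
    }
    server {
        listen 8080;
        location / {
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_pass http://chunky;
        }
    }

    # nginx#2 - serves the 15M file re-chunked into tiny chunks
    server {
        listen 8081;
        location / {
            mass_chunk on;                  # debug module directive
            mass_chunk_multi_packets on;    # one chunk per TCP packet
            root /var/www;
        }
    }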

I wasn’t able to reproduce the segfault; with proxy_buffering off
everything works fine.

But when I’m using proxy_buffering and proxy_cache (with sendfile on and
directio 2M), it seems like readv() returns 0, but finalize_request
returns NGX_AGAIN.

I’m not sure that test is relevant, or what I’m really exercising here;
it might even be the TCP layer directly. But maybe the failure should be
explicit.

The behavior was the same with or without the patch. I’ve also tried
various combinations of proxy buffers, sendfile, and directio.

Any thoughts ?

Thank you!
Matthieu.

On Wed, Oct 19, 2011 at 2:24 AM, Maxim D. [email protected]
wrote:


ngx_http_finalize_request() is void; it doesn’t return anything.

You may want to provide a debug log to make it clear what you are
talking about.

I’m not exactly sure what part of it is relevant, so instead of posting
pages of log I provided steps to reproduce a configuration where nginx
shows erratic behavior, by not outputting the entire response while doing
a proxy_pass.

I’ll be experimenting more, and I’ll let you know if I find the origin
of
this issue.

M.

Hello!

On Tue, Oct 18, 2011 at 06:32:53PM -0700, Matthieu T. wrote:

with proxy_buffering off everything works fine.
To reproduce the segfault you have to:

  1. Fill client socket buffers to make sure no buffers will be
    freed at (2), i.e. all buffers will become busy. This should be
    done carefully to avoid making some proxy buffers busy here, as
    this is likely to make (2) impossible.

  2. Trigger the “all buffers in the out chain” problem by sending multiple
    small chunks at once. The amount of data must be less than
    proxy_busy_buffers_size, but the total input size (including chunked
    encoding overhead) must be big enough to consume all proxy_buffers
    (including the first one, proxy_buffer_size). Example buffer settings
    follow this list.

  3. Send some more data to actually trigger the segfault due to all
    buffers being busy.
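
For illustration only, buffer settings like these make the conditions in
(2) easy to meet (the sizes are an example, not taken from the report):

    proxy_buffer_size 4k;
    proxy_buffers 8 4k;            # nine buffers, 36k in total
    proxy_busy_buffers_size 8k;

A burst of chunks carrying less than 8k of payload, but with enough
chunked-encoding overhead to fill all nine buffers, then sets up the
segfault at (3).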

I’m able to reproduce it here (using a specially crafted backend
server and a specially crafted client), and I’m not really interested
in another artificial reproduction.

But when I’m using proxy_buffering and proxy_cache (with sendfile on and
directio 2M), it seems like readv() returns 0, but finalize_request
returns NGX_AGAIN.

ENOPARSE

ngx_http_finalize_request() is void; it doesn’t return anything.

You may want to provide a debug log to make it clear what you are
talking about.

Maxim D.

Hello!

On Wed, Oct 19, 2011 at 10:49:26AM -0700, Matthieu T. wrote:

On Wed, Oct 19, 2011 at 2:24 AM, Maxim D. [email protected] wrote:

Hello!

On Tue, Oct 18, 2011 at 06:32:53PM -0700, Matthieu T. wrote:

[…]

talking about.

I’m not exactly sure what part of it is relevant, so instead of posting
pages of log I provided steps to reproduce a configuration where nginx
shows erratic behavior, by not outputting the entire response while doing
a proxy_pass.

So the issue you are seeing is a truncated response, right?

I’m not able to reproduce it here (though I’ve tested under
FreeBSD, not Linux). Do sendfile and/or directio have any
influence?

And the point still remains: please provide a debug log. Obviously
posting it here is a bad idea as it’s likely to be huge, but
making a gzipped version available for download would be really
helpful.

I’ll be experimenting more, and I’ll let you know if I find the origin of
this issue.

Please note that testing with your module may be problematic,
as there are at least the following issues with it:

  1. “mass_chunk_multi_packets on” is expected to cause truncated
    responses, as it aborts the response on != NGX_OK from the next body
    filter (e.g. on NGX_AGAIN if the network isn’t as fast as reading
    from disk for some reason).

  2. Checksums will not match, as buffers from previous filters are
    marked as free before the actual data are sent, thus causing mostly
    arbitrary data to be sent (again, if the network isn’t as fast as
    reading from disk for some reason).

Maxim D.