Nginx sometimes does not process short preread body (ngx_upstream_process_body)

Hello,

We are using nginx/0.6.22 on Ubuntu hardy or intrepid, i386 or
amd64. nginx is in front of a mongrels rails app.

For one particular request, the response from the rails app is received
by nginx, with the header and the body together in one readv(2). However
it waits for 60s and then sends out only the header to the client,
without the body, but with the correct Content-Length, as if the body
were included!

This happens every time for this request but POSTs to other URLs
resulting in 302 work fine.

We have quite a complex config, and possibly that might be related. Any
suggestions would be very much appreciated!

Here is the strace (of nginx)

22:32:00.914649 write(3, “2008/09/21 22:32:00 [debug] 3432#0: epoll:
fd:6 ev:0005 d:0000000000709271\n”, 75) = 75 <0.000011>
22:32:00.914718 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
upstream process header\n”, 68) = 68 <0.000005>
22:32:00.914749 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 malloc:
00000000006C7930:4096\n”, 69) = 69 <0.000005>
22:32:00.914778 recvfrom(6, "HTTP/1.1 302 Moved
Temporarily\r\nConnection: close\r\nDate: Mon, 22 Sep 2008 08:31:57
GMT\r\nSet-Cookie: uid=884s; path=/\r\nSet-Cookie:
globalize_change_user_key=; path=/\r\nSet-Cookie: language_code=en;
path=/\r\nSet-Cookie: _iknow_sid=baa1d09712a041eca1edb7f6c273ea96;
path=/; expires=Mon, 06 Oct 2008 06:53:39 GMT\r\nStatus: 302
Found\r\nLocation: http://localhost/list/100005/build\r\nX-Runtime:
1.55355\r\nCache-Control: no-cache\r\nServer: Mongrel
1.0.1\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length:
100\r\n\r\nYou are being <a
href="http://localhost/list/100005/build\“>redirected</a>.</body></html>”,
4096, 0, NULL, NULL) = 604 <0.000006>

22:32:00.914845 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 recv:
fd:6 604 of 4096\n”, 62) = 62 <0.000005>
22:32:00.914875 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy status 302 "302 Moved Temporarily"\n”, 85) = 85 <0.000005>
22:32:00.914907 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Connection: close"\n”, 78) = 78 <0.000005>
22:32:00.914937 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Date: Mon, 22 Sep 2008 08:31:57 GMT"\n”, 96) = 96
<0.000019>
22:32:00.914983 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Set-Cookie: uid=884s; path=/"\n”, 89) = 89 <0.000005>
22:32:00.915015 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Set-Cookie: globalize_change_user_key=; path=/"\n”,
107) = 107 <0.000005>
22:32:00.915047 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Set-Cookie: language_code=en; path=/"\n”, 97) = 97
<0.000005>
22:32:00.915079 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Set-Cookie: _iknow_sid=baa1d09712a041eca1edb7f6c273ea96;
path=/; expires=Mon, 06 Oct 2008 06:53:39 GMT"\n”, 163) = 163
<0.000005>
22:32:00.915114 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Status: 302 Found"\n”, 78) = 78 <0.000005>
22:32:00.915144 write(3, "2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Location: http://localhost/list/100005/build\“\n”, 105)
= 105 <0.000005>
22:32:00.915176 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "X-Runtime: 1.55355"\n”, 79) = 79 <0.000005>
22:32:00.915206 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Cache-Control: no-cache"\n”, 84) = 84 <0.000005>
22:32:00.915236 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Server: Mongrel 1.0.1"\n”, 82) = 82 <0.000005>
22:32:00.915267 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Content-Type: text/html; charset=utf-8"\n”, 99) = 99
<0.000004>
22:32:00.915297 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header: "Content-Length: 100"\n”, 80) = 80 <0.000004>
22:32:00.915326 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
proxy header done\n”, 62) = 62 <0.000005>
22:32:00.915365 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 malloc:
0000000000763170:4096\n”, 69) = 69 <0.000005>
22:32:00.915396 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4
HTTP/1.1 302 Moved Temporarily\r\nServer: nginx/0.6.32\r\nDate: Mon, 22
Sep 2008 08:32:00 GMT\r\nContent-Type: text/html;
charset=utf-8\r\nConnection: keep-alive\r\nSet-Cookie: uid=884s;
path=/\r\nSet-Cookie: globalize_change_user_key=; path=/\r\nSet-Cookie:
language_code=en; path=/\r\nSet-Cookie:
_iknow_sid=baa1d09712a041eca1edb7f6c273ea96; path=/; expires=Mon, 06 Oct
2008 06:53:39 GMT\r\nStatus: 302 Found\r\nLocation:
http://localhost/list/100005/build\r\nX-Runtime:
1.55355\r\nCache-Control: no-cache\r\nContent-Length: 100\r\n\n\n”, 547)
= 547 <0.000005>
22:32:00.915456 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 write
new buf t:1 f:0 00000000007631B0, pos 00000000007631B0, size: 508 file:
0, size: 0\n”, 128) = 128 <0.000006>
22:32:00.915498 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
write filter: l:0 f:0 s:508\n”, 72) = 72 <0.000005>
22:32:00.915531 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
upstream process upstream\n”, 70) = 70 <0.000005>
22:32:00.915559 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
read upstream: 1\n”, 61) = 61 <0.000005>
22:32:00.915587 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
preread: 100\n”, 57) = 57 <0.000005>
22:32:00.915615 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 readv:
1:3492\n”, 53) = 53 <0.000005>
22:32:00.915641 readv(6, 0x7fff756d69b0, 1) = -1 EAGAIN (Resource
temporarily unavailable) <0.000005>
22:32:00.915669 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 readv()
not ready (11: Resource temporarily unavailable)\n”, 96) = 96 <0.000004>
22:32:00.915699 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
recv chain: -2\n”, 59) = 59 <0.000005>
22:32:00.915727 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
buf free s:0 t:1 f:0 00000000006C7930, pos 00000000006C7B28, size: 100
file: 0, size: 0\n”, 132) = 132 <0.000005>
22:32:00.915760 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
write downstream: 1\n”, 64) = 64 <0.000004>
22:32:00.915787 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
write busy: 0\n”, 58) = 58 <0.000005>
22:32:00.915815 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
write: out:0000000000000000, f:0\n”, 77) = 77 <0.000005>
22:32:00.915843 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
read upstream: 0\n”, 61) = 61 <0.000005>
22:32:00.915871 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 pipe
buf free s:0 t:1 f:0 00000000006C7930, pos 00000000006C7B28, size: 100
file: 0, size: 0\n”, 132) = 132 <0.000005>
22:32:00.915904 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 event
timer del: 6: 1222072377084\n”, 73) = 73 <0.000005>
22:32:00.915933 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 event
timer add: 6: 60000:1222072380914\n”, 79) = 79 <0.000005>
22:32:00.915963 write(3, “2008/09/21 22:32:00 [debug] 3432#0: *4 http
upstream dummy handler\n”, 67) = 67 <0.000005>
22:32:00.915991 write(3, “2008/09/21 22:32:00 [debug] 3432#0: timer
delta: 3830\n”, 54) = 54 <0.000005>
22:32:00.916018 write(3, “2008/09/21 22:32:00 [debug] 3432#0: posted
events 0000000000000000\n”, 67) = 67 <0.000005>
22:32:00.916046 write(3, “2008/09/21 22:32:00 [debug] 3432#0: worker
cycle\n”, 49) = 49 <0.000005>
22:32:00.916073 write(3, “2008/09/21 22:32:00 [debug] 3432#0: epoll
timer: 60000\n”, 55) = 55 <0.000005>

Why not respond now!?

22:32:00.916100 epoll_wait(8, {}, 512, 60000) = 0 <59.997826>
22:33:00.913995 write(3, “2008/09/21 22:33:00 [debug] 3432#0: timer
delta: 59999\n”, 55) = 55 <0.000010>
22:33:00.914046 write(3, “2008/09/21 22:33:00 [debug] 3432#0: posted
events 0000000000000000\n”, 67) = 67 <0.000005>
22:33:00.914076 write(3, “2008/09/21 22:33:00 [debug] 3432#0: worker
cycle\n”, 49) = 49 <0.000005>
22:33:00.914104 write(3, “2008/09/21 22:33:00 [debug] 3432#0: epoll
timer: 1\n”, 51) = 51 <0.000005>
22:33:00.914131 epoll_wait(8, {}, 512, 1) = 0 <0.010455>
22:33:00.924637 write(3, “2008/09/21 22:33:00 [debug] 3432#0: timer
delta: 11\n”, 52) = 52 <0.000008>
22:33:00.924693 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 event
timer del: 6: 1222072380914\n”, 73) = 73 <0.000005>
22:33:00.924723 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
upstream process upstream\n”, 70) = 70 <0.000005>
22:33:00.924760 write(3, “2008/09/21 22:33:00 [error] 3432#0: *4
upstream timed out (110: Connection timed out) while reading upstream,
client: 127.0.0.1, server: www.iknow.co.jp, request: "POST
/lists/new/theme HTTP/1.1", upstream:
"http://127.0.0.1:3000/lists/new/theme\”, host: "localhost",
referrer: "http://localhost/lists/new\“\n”, 306) = 306 <0.000005>
22:33:00.924806 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
upstream exit: 0000000000000000\n”, 76) = 76 <0.000027>
22:33:00.924858 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4
finalize http upstream request: 0\n”, 73) = 73 <0.000005>
22:33:00.924902 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4
finalize http proxy request\n”, 67) = 67 <0.000005>
22:33:00.924931 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 free rr
peer 1 0\n”, 56) = 56 <0.000005>
22:33:00.924959 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 close
http upstream connection: 6\n”, 73) = 73 <0.000005>
22:33:00.924987 close(6) = 0 <0.000034>
22:33:00.925040 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
upstream temp fd: -1\n”, 65) = 65 <0.000005>
22:33:00.925071 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
output filter "/lists/new/theme?"\n”, 78) = 78 <0.000005>
22:33:00.925101 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 copy
filter: "/lists/new/theme?"\n”, 72) = 72 <0.000005>
22:33:00.925132 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
postpone filter "/lists/new/theme?" 00007FFF756D6D20\n”, 97) = 97
<0.000005>
22:33:00.925163 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
postpone filter out "/lists/new/theme?"\n”, 84) = 84 <0.000005>
22:33:00.925195 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 write
old buf t:1 f:0 00000000007631B0, pos 00000000007631B0, size: 508 file:
0, size: 0\n”, 128) = 128 <0.000005>
22:33:00.925229 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 write
new buf t:0 f:0 0000000000000000, pos 0000000000000000, size: 0 file: 0,
size: 0\n”, 126) = 126 <0.000005>
22:33:00.925262 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
write filter: l:1 f:0 s:508\n”, 72) = 72 <0.000005>
22:33:00.925290 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
write filter limit 0\n”, 65) = 65 <0.000005>
22:33:00.925319 writev(10, [{“HTTP/1.1 302 Moved Temporarily\r\nServer:
nginx/0.6.32\r\nDate: Mon, 22 Sep 2008 08:32:00 GMT\r\nContent-Type:
text/html; charset=utf-8\r\nConnection: keep-alive\r\nSet-Cookie:
uid=884s; path=/\r\nSet-Cookie: globalize_change_user_key=;
path=/\r\nSet-Cookie: language_code=en; path=/\r\nSet-Cookie:
_iknow_sid=baa1d09712a041eca1edb7f6c273ea96; path=/; expires=Mon, 06 Oct
2008 06:53:39 GMT\r\nStatus: 302 Found\r\nLocation:
http://localhost/list/100005/build\r\nX-Runtime:
1.55355\r\nCache-Control: no-cache\r\nContent-Length: 100\r\n\r\n”,
508}], 1) = 508 <0.000100>

The body is not sent!

22:33:00.925480 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 writev:
508\n”, 51) = 51 <0.000004>
22:33:00.925510 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
write filter 0000000000000000\n”, 74) = 74 <0.000005>
22:33:00.925540 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 copy
filter: 0 "/lists/new/theme?"\n”, 74) = 74 <0.000005>
22:33:00.925569 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
finalize request: 0, "/lists/new/theme?"\n”, 85) = 85 <0.000005>
22:33:00.925600 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 set
http keepalive handler\n”, 66) = 66 <0.000005>
22:33:00.925629 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
close request\n”, 58) = 58 <0.000005>
22:33:00.925657 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
log handler\n”, 56) = 56 <0.000005>
22:33:00.925690 write(4, “127.0.0.1 - - [21/Sep/2008:22:33:00 -1000]
"POST /lists/new/theme HTTP/1.1" 302 0 "http://localhost/lists/new\
"Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.1) Gecko/2008072820
Ubuntu/8.10 (intrepid) Firefox/3.0.1" 63.844 "-" "localhost" "-"
[_iknow_sid=baa1d09712a041eca1edb7f6c273ea96;
Navicast_SessID=j8965.5ee1aa6e647d84.www2; language_code=en;
ysm_bbk1SGMN8HHSPL0MNUPLPM0FI4M7BS=13172008; uid=884s;
globalize_change_user_key=]\n", 438) = 438 <0.000009>
22:33:00.925749 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 free:
00000000006C7930\n”, 62) = 62 <0.000005>
22:33:00.925778 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 free:
00000000006C5910, unused: 0\n”, 73) = 73 <0.000005>
22:33:00.925807 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 free:
00000000006C6920, unused: 16\n”, 74) = 74 <0.000005>
22:33:00.925837 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 free:
0000000000763170, unused: 3049\n”, 76) = 76 <0.000005>
22:33:00.925874 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 event
timer add: 10: 75000:1222072455924\n”, 80) = 80 <0.000005>
22:33:00.925904 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 free:
00000000006CD330\n”, 62) = 62 <0.000005>
22:33:00.925933 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 free:
00000000006CCD50\n”, 62) = 62 <0.000005>
22:33:00.925961 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 hc
free: 0000000000000000 0\n”, 67) = 67 <0.000005>
22:33:00.925989 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 hc
busy: 0000000000000000 0\n”, 67) = 67 <0.000005>
22:33:00.926018 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 post
event 000000000072F2F0\n”, 67) = 67 <0.000005>
22:33:00.926047 write(3, “2008/09/21 22:33:00 [debug] 3432#0: posted
events 000000000072F2F0\n”, 67) = 67 <0.000004>
22:33:00.926075 write(3, “2008/09/21 22:33:00 [debug] 3432#0: posted
event 000000000072F2F0\n”, 66) = 66 <0.000005>
22:33:00.926104 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 delete
posted event 000000000072F2F0\n”, 76) = 76 <0.000005>
22:33:00.926133 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 http
keepalive handler\n”, 62) = 62 <0.000005>
22:33:00.926161 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 malloc:
00000000006CCD50:1024\n”, 69) = 69 <0.000005>
22:33:00.926191 recvfrom(10, 0x6ccd50, 1024, 0, 0, 0) = -1 EAGAIN
(Resource temporarily unavailable) <0.000003>
22:33:00.926215 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 recv:
fd:10 -1 of 1024\n”, 62) = 62 <0.000005>
22:33:00.926245 write(3, “2008/09/21 22:33:00 [debug] 3432#0: *4 recv()
not ready (11: Resource temporarily unavailable)\n”, 95) = 95 <0.000005>
22:33:00.926275 write(3, “2008/09/21 22:33:00 [debug] 3432#0: posted
event 0000000000000000\n”, 66) = 66 <0.000005>
22:33:00.926303 write(3, “2008/09/21 22:33:00 [debug] 3432#0: worker
cycle\n”, 49) = 49 <0.000005>
22:33:00.926331 write(3, “2008/09/21 22:33:00 [debug] 3432#0: epoll
timer: 75000\n”, 55) = 55 <0.000005>
22:33:00.926358 epoll_wait(8, {}, 512, 75000) = 0 <74.998182>
22:34:15.924609 write(3, “2008/09/21 22:34:15 [debug] 3432#0: timer
delta: 75000\n”, 55) = 55 <0.000010>
22:34:15.924661 write(3, “2008/09/21 22:34:15 [debug] 3432#0: *4 event
timer del: 10: 1222072455924\n”, 74) = 74 <0.000005>
22:34:15.924692 write(3, “2008/09/21 22:34:15 [debug] 3432#0: *4 http
keepalive handler\n”, 62) = 62 <0.000005>
22:34:15.924720 write(3, “2008/09/21 22:34:15 [debug] 3432#0: *4 close
http connection: 10\n”, 65) = 65 <0.000005>
22:34:15.924749 close(10) = 0 <0.000102>

John F. [email protected] writes:

resulting in 302 work fine.
The problem seems to be caused because the body gets stuck inside the
ngx_event_pipe. This patch fixes the problem. Any comments or
suggestions?

(The patch is copyright by Cerego and released under the same license as
nginx.)

Hello!

On Thu, Sep 25, 2008 at 12:56:02AM +0100, John F. wrote:

This happens every time for this request but POSTs to other URLs
resulting in 302 work fine.

The problem seems to be caused because the body gets stuck inside the
ngx_event_pipe. This patch fixes the problem. Any comments or
suggestions?

The patch looks wrong for me (not even mentioning that it’s
whitespace damaged). It destroys buffering for no real reason,
and not only for preread body, but for ordinary reads too.

From strace in your message, it looks like the real problems are:

  1. Backend, which hasn’t closed connection after sending reply.

  2. The fact that nginx waits for connection close before it
    realizes that request was completed, even if got ‘Content-Length’
    header and appropriate number of body bytes.

The 1 isn’t really related to nginx. The 2 should be fixed, but
a) the patch doesn’t do it (or at least doesn’t do it properly)
and b) it only matters with backend bug as in 1.

Could you please confirm that backend doesn’t close connection?

Maxim D.

(The patch is copyright by Cerego and released under the same license as
nginx.)

        size = cl->buf->end - cl->buf->last;
            n -= size;
  •  }
      }
    
      if (cl) {

Thanks for answering

On Thu, 25 Sep 2008 18:56:16 +0900 Maxim D. [email protected]
wrote:

Content-Length, as if the body were included!

This happens every time for this request but POSTs to other URLs
resulting in 302 work fine.

From strace in your message, it looks like the real problems are:

  1. Backend, which hasn’t closed connection after sending reply.

Yes, the backend does not close.

  1. The fact that nginx waits for connection close before it
    realizes that request was completed, even if got ‘Content-Length’
    header and appropriate number of body bytes.

And 3. nginx never forwards the body of the message after the time out,
(even though it has read it) sending only the header
with content-length!

The 1 isn’t really related to nginx. The 2 should be fixed, but
a) the patch doesn’t do it (or at least doesn’t do it properly)

It does fix the problem . . . Sorry for destroying buffering!

and b) it only matters with backend bug as in 1.

Could you please confirm that backend doesn’t close connection?

Yes, the backend does not close the connection.

Hello!

On Thu, Sep 25, 2008 at 09:25:08PM +0900, John F. wrote:

[…]

From strace in your message, it looks like the real problems are:

  1. Backend, which hasn’t closed connection after sending reply.

Yes, the backend does not close.

Ok, so I was right and backend isn’t http-complaint.

  1. The fact that nginx waits for connection close before it
    realizes that request was completed, even if got ‘Content-Length’
    header and appropriate number of body bytes.

And 3. nginx never forwards the body of the message after the time out,
(even though it has read it) sending only the header
with content-length!

Yep, thats another problem in nginx with handling such
buggy backends.

The 1 isn’t really related to nginx. The 2 should be fixed, but
a) the patch doesn’t do it (or at least doesn’t do it properly)

It does fix the problem . . . Sorry for destroying buffering!

The problem is that patch penalizes configurations with non-buggy
backends in an attempt to solve problem with buggy ones.

I’m not SPEWS^W^W not Igor, but it looks for me that the chances
the patch will be accepted is rather small.

On the other hand I’ve seen at least two similar reports in last
month or two, and it will be good for nginx to accept such backend
replies. And handling of this situation correctly is anyway
required to support persistent connections in future (if any).

I believe I will be able to reproduce the problem and I’ll try to
produce more correct patch as time permits.

Maxim D.

“Denis F. Latypoff” [email protected] writes:
[…]

Thursday, September 25, 2008, 9:40:56 PM, you wrote:
[…]

On Thu, Sep 25, 2008 at 09:25:08PM +0900, John F. wrote:

[…]

From strace in your message, it looks like the real problems are:

  1. Backend, which hasn’t closed connection after sending reply.

Yes, the backend does not close.

Ok, so I was right and backend isn’t http-complaint.

Sorry to be a pain, but I can’t find where this behaviour is specified.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8

        If either the client or the server sends the close token in
        the Connection header, that request becomes the last one for
        the connection.

Nothing about which side should close the connection?

I do agree that the mongrels backend is wrong to keep the connection
open as it is a waste of resources, but I can’t see where it is
specified.

[…]

Did you tried

proxy_buffering off;
keepalive_timeout 0;
gzip off;

for these locations?

Turning off the proxy_buffering makes the problem go away, IIRC. I am
not
sure if we also tried combinations of the other options.

[…]

I believe I will be able to reproduce the problem and I’ll try to
produce more correct patch as time permits.

Thanks!

[…]

Hello Maxim,

Thursday, September 25, 2008, 9:40:56 PM, you wrote:

Hello!

On Thu, Sep 25, 2008 at 09:25:08PM +0900, John F. wrote:

[…]

From strace in your message, it looks like the real problems are:

  1. Backend, which hasn’t closed connection after sending reply.

Yes, the backend does not close.

Ok, so I was right and backend isn’t http-complaint.

  1. The fact that nginx waits for connection close before it
    realizes that request was completed, even if got ‘Content-Length’
    header and appropriate number of body bytes.

And 3. nginx never forwards the body of the message after the time out,
(even though it has read it) sending only the header
with content-length!

Did you tried

proxy_buffering off;
keepalive_timeout 0;
gzip off;

for these locations?

Yep, thats another problem in nginx with handling such
buggy backends.

The 1 isn’t really related to nginx. The 2 should be fixed, but
a) the patch doesn’t do it (or at least doesn’t do it properly)

It does fix the problem . . . Sorry for destroying buffering!

The problem is that patch penalizes configurations with non-buggy
backends in an attempt to solve problem with buggy ones.

I’m not SPEWS^W^W not Igor, but it looks for me that the chances
the patch will be accepted is rather small.

On the other hand I’ve seen at least two similar reports in last
month or two, and it will be good for nginx to accept such backend
replies. And handling of this situation correctly is anyway
required to support persistent connections in future (if any).

yep, agreed 100%!

I believe I will be able to reproduce the problem and I’ll try to
produce more correct patch as time permits.

Maxim D.

__________ NOD32 3469 (20080924) Information __________

Hello!

On Thu, Sep 25, 2008 at 06:40:56PM +0400, Maxim D. wrote:

  1. The fact that nginx waits for connection close before it
    realizes that request was completed, even if got ‘Content-Length’
    header and appropriate number of body bytes.

And 3. nginx never forwards the body of the message after the time out,
(even though it has read it) sending only the header
with content-length!

Yep, thats another problem in nginx with handling such buggy backends.

[…]

I believe I will be able to reproduce the problem and I’ll try to
produce more correct patch as time permits.

I’ve finally build two patches for this, attached.

The patch-nginx-proxy-flush.txt patch adds flushing of partially
filled buffers in pipe if upstream times out. This makes nginx
replies as close as possible to upstream’s ones.

The patch-nginx-proxy-length.txt allows nginx to use
Content-Length it got from upstream as a hint that there is no
need to wait for upstream’s connection close.

Patches are independent and apply cleanly in any order.

Igor, could you please take a look?

Maxim D.

Hello!

On Sat, Sep 27, 2008 at 06:12:22PM +0100, John F. wrote:

       the connection.

Nothing about which side should close the connection?

I do agree that the mongrels backend is wrong to keep the connection
open as it is a waste of resources, but I can’t see where it is specified.

When talking to backends nginx uses HTTP/1.0. And RFC 1945 says
(1.3 Overall Operation):

% Except for experimental applications, current practice requires that
% the connection be established by the client prior to each request
and
% closed by the server after sending the response.

Maxim D.

p.s. The HTTP/1.1 spec basically ignores lower levels of
communication, pretending to be non-TCP/IP-bound protocol. This
somewhat make sense (SCTP comes to us, heh), but greatly reduce
it’s usability as a standard.

Hi Maxim,

Sorry for the delay

Maxim D. wrote:

And 3. nginx never forwards the body of the message after the time out,
I’m not SPEWS^W^W not Igor, but it looks for me that the chances the
I’ve finally build two patches for this, attached.

Igor, could you please take a look?

These patches definitely fix our problem. Thanks

To reproduce easily see this toy Ruby script webserver which
deliberately exposes the problem

require ‘socket’

server = TCPServer.new(“127.0.0.1”, 3000)

text = “All my base are belong to who?\n”

loop do
session = server.accept
request = session.gets

header = session.gets until header =~ /^\s*$/

puts “#{Time.now} #{session.peeraddr[3]} #{request}”

session.puts “HTTP/1.0 200 OK”
session.puts “Connection: close”
session.puts “Content-Length: #{text.size}”
session.puts
session.write text
session.flush
end

This server does not close the connection, only flushes. Without the
patch, nginx will timeout and never send the body. With the patch, it
works fine.

The real problem occurs sporadically because of the real mongrel
webserver’s borked HTTP handling. I appreciate that it is out of spec
but it would be nice if nginx handled this case.