Hello!
On Tue, Aug 20, 2013 at 09:49:32AM +0200, Philip Hofstetter wrote:
Ok. I have three debug logs now:
http://www.gnegg.ch/debug-cancel.log
is the first log I created where I quit curl once nginx has logged a
200 status with a truncated length to the access log (how can it log
success while it’s still transmitting data?)
A http status code nginx logs to access log corresponds to the
code sent to a client. As the code was already sent at the time
the problem was detected - it’s 200.
To add a further note: The machine which shows this issue is under
considerable load. When I run the tests against and identical machine
which is not under load, the download runs correctly (until I do put
it under load at which point it fails the same way).
The fact that nginx logs the request as successful (but truncated)
while it’s still ongoing does kinda point to a kernel issue, but I’m
really just guessing at this point.
Both full logs show that nothing happens in 60 seconds (while there are
unsent data pending):
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http output filter
“/index.php/winclient/gnegg?”
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http copy filter:
“/index.php/winclient/gnegg?”
2013/08/20 09:33:31 [debug] 1692#0: *1101651 image filter
2013/08/20 09:33:31 [debug] 1692#0: *1101651 xslt filter body
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http postpone filter
“/index.php/winclient/gnegg?” 00000000022A7218
2013/08/20 09:33:31 [debug] 1692#0: *1101651 write new buf t:0 f:0
0000000000000000, pos 000000000231CAF0, size: 4096 file: 0, size: 0
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http write filter: l:0 f:1
s:4096
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http write filter limit 0
2013/08/20 09:33:31 [debug] 1692#0: *1101651 writev: 1953
Note: only 1953 of 4096 bytes were sent.
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http write filter
00000000022A7228
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http copy filter: -2
“/index.php/winclient/gnegg?”
2013/08/20 09:33:31 [debug] 1692#0: *1101651 event timer del: 141:
1376984038781
2013/08/20 09:33:31 [debug] 1692#0: *1101651 event timer add: 141:
60000:1376984071388
Note: timer was set to timeout after 60 seconds.
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream request:
“/index.php/winclient/gnegg?”
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream process non
buffered upstream
2013/08/20 09:33:31 [debug] 1692#0: *1101651 event timer: 141, old:
1376984071388, new: 1376984071390
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream request:
“/index.php/winclient/gnegg?”
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream dummy handler
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream request:
“/index.php/winclient/gnegg?”
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream process non
buffered upstream
2013/08/20 09:33:31 [debug] 1692#0: *1101651 event timer: 141, old:
1376984071388, new: 1376984071645
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream request:
“/index.php/winclient/gnegg?”
2013/08/20 09:33:31 [debug] 1692#0: *1101651 http upstream dummy handler
2013/08/20 09:34:31 [debug] 1692#0: *1101651 event timer del: 141:
1376984071388
2013/08/20 09:34:31 [debug] 1692#0: *1101651 http run request:
“/index.php/winclient/gnegg?”
2013/08/20 09:34:31 [debug] 1692#0: *1101651 http upstream process non
buffered downstream
2013/08/20 09:34:31 [info] 1692#0: *1101651 client timed out (110:
Connection timed out) while sending to client, client: 80.219.149.116,
server: , request: “POST /index.php/winclient/gnegg HTTP/1.0”, upstream:
“http://127.0.0.1:8081/index.php/winclient/gnegg”, host:
“REDACTED.popscan.ch”
2013/08/20 09:34:31 [debug] 1692#0: *1101651 finalize http upstream
request: 408
After a 60 seconds timer was fired and client connection was
closed as timed out.
That is, from nginx point of view everything looks like a real
timeout.
Unfortunately, with location-level debug logs it’s not possible to
see event handling details (and that’s why it’s generally
recommended to activate debug log at global level, BTW). But I
would suppose everything is fine there as well, and the problem is
actually a result of kernel’s behaviour.
–
Maxim D.
http://nginx.org/en/donation.html