I have a machine running the latest devel version, nginx-0.7.33 (tried 0.6.35 with the same results), serving small files (images under 10K), and I am seeing on tcpdump that some SYN packets are not responded to right away. The browser retransmits these image requests every second, and on the 2nd or 3rd resent SYN I finally see a SYN/ACK and the images load. It is very nondeterministic as to when it happens, and I can only reproduce it some of the time. When it does occur, the outcome is a page with some images loaded while others (whose SYN packets are not ACKed) are not... a few seconds later they load.

Typically the system has ~2000 active connections, most in keepalive. The load is around 100-200 req/sec. I have tried all sorts of settings and configurations suggested on the mailing list, but I still don't have a solution for this issue: from 1 to 4 workers, changing the connection counts, different event handlers, kernel buffers, etc. It just seems anecdotal to change a bunch of settings without being able to see what is happening internally. I'd like to be able to debug a little deeper to find out what is happening to these packets. How would I go about debugging the cause of it? Is it the interface driver, the kernel, or nginx? What kind of tools and debugging options can I try next?
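For reference, a capture filter that isolates the handshake packets makes the retransmitted SYNs easy to spot; this is only a sketch, and the interface name and port are assumptions:

```shell
# Show only SYN and RST packets on port 80; a retransmitted SYN appears
# as a repeated "S" from the same client port with no SYN/ACK in between.
tcpdump -n -i eth0 'tcp port 80 and tcp[tcpflags] & (tcp-syn|tcp-rst) != 0'
```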
Thanks, Radek

System details:

model name : Intel(R) Xeon(R) CPU X3210 @ 2.13GHz
Linux n06 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 12:03:43 EST 2008 i686 i686 i386 GNU/Linux

eth0:
  Advertised auto-negotiation: Yes
  Speed: 1000Mb/s
  Duplex: Full
  Port: Twisted Pair
  PHYAD: 1
  Transceiver: internal
  Auto-negotiation: on
  Supports Wake-on: g
  Wake-on: d
  Current message level: 0x000000ff (255)
  Link detected: yes
  driver: tg3
  version: 3.86
  firmware-version: 5721-v3.61, ASFIPMI v6.21
  bus-info: 0000:05:00.0

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.10   0.00     0.20     2.43    0.00  97.27

Device:  rrqm/s  wrqm/s    r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00    0.00  27.40  0.00  443.20    0.00     16.18      0.10   3.50   3.32   9.10

No overruns or errors on the interface.
on 2009-02-04 08:15
on 2009-02-04 08:21
On Tue, Feb 03, 2009 at 10:07:27PM -0800, Radek Burkat wrote:
> load is around 100-200 req/sec.
> How would I go about debugging what is the cause of it. Is it the interface
> driver, kernel, or nginx? What kind of tools and debugging options can I
> try next?

nginx has no part in the TCP handshake (SYN, SYN/ACK, ACK); this is a kernel or driver issue.
on 2009-02-04 10:13
Thanks, Igor, for eliminating one variable for me. I noticed that when I use a single worker and set its affinity to the same core that handles the ethernet interface interrupt, I decrease the frequency of the issue but do not eliminate it; I'm not sure why. Are there any kernel/network debugging tools that reach a little deeper than tcpdump?

Radek
on 2009-02-04 10:37
Hello, have you tried disabling and/or tuning syncookies? The varnish documentation (http://varnish.projects.linpro.no/wiki/Performance) suggests at least this setup in /etc/sysctl.conf:

net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_max_orphans = 262144
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2

And what about netfilter connection tracking?

Olivier

Radek Burkat wrote:
on 2009-02-10 00:05
I have been looking at this issue to try to understand the interactions, and this is what I have found out so far. I can reproduce the issue by decreasing net.ipv4.tcp_max_syn_backlog and setting net.ipv4.tcp_syncookies = 0. It makes sense that when the load gets a little high, the system does not process accepts fast enough and the backlog builds up. When the backlog fills up, the SYN/ACK to the initial SYN is not issued.

It would also seem that enabling syncookies uses some other mechanism, as I do not see the dropped connections when I have net.ipv4.tcp_syncookies = 1 and the backlog set to some small value like net.ipv4.tcp_max_syn_backlog = 10. From what I have read so far, syncookies go into action when you hit a certain high-water mark on your backlog buffer. Raising the backlog value and/or enabling syncookies seems like a good idea.

You can look at how many connections are in the backlog with netstat:

netstat -aln | grep -c SYN_RECV
146

The problem is that this gives you an instantaneous count of the backlog. As you can imagine, the backlog is being filled and cleared at a high rate. It would be nice to know when it actually gets full, or even what the maximum has been over a period of time. Is there a statistic like this in the network statistics? This information seems quite important in determining your server's latency.

In nginx there is the optional backlog parameter in the listen configuration. I imagine this maps directly to the listen() system call. Though I see that the nginx backlog default is -1; how does that translate to the system parameter, which requires an int above 0?

Radek
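As it turns out, Linux does keep cumulative counters for this in /proc/net/netstat (the TcpExt ListenOverflows and ListenDrops fields, also summarized by `netstat -s`). A sketch that pairs up the TcpExt header line with its value line; the exact field names can vary between kernel versions:

```shell
# listen_stats: print the kernel's cumulative listen-queue drop counters
# (ListenOverflows / ListenDrops).  Reads /proc/net/netstat by default;
# a different file may be passed in for testing.
listen_stats() {
    awk '/^TcpExt:/ {
        if (!have) { split($0, names); have = 1 }   # first line: field names
        else {
            for (i = 2; i <= NF; i++)               # second line: values
                if (names[i] ~ /^Listen/) print names[i], $i
        }
    }' "${1:-/proc/net/netstat}"
}
```

Because these counters only ever increase, sampling them periodically and watching the deltas tells you whether the queue is actually overflowing, unlike the instantaneous SYN_RECV count above.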
on 2009-02-10 09:19
On Mon, Feb 09, 2009 at 01:51:12PM -0800, Radek Burkat wrote:
> From what I read so far is that syncookies goes into action when you hit a
> This information seems like it is quite important in determining the
> performance of your server latency.
>
> In nginx there is the optional backlog variable in the listen configuration.
> I imagine that this maps directly to the system call listen. Though I see
> that the nginx backlog default is -1. How does that translate to the system
> parameter which requires an int above 0.

Yes, backlog may affect this. However, it seems that you are running a very old nginx, as this was fixed in 2007:

Changes with nginx 0.6.7        15 Aug 2007
Changes with nginx 0.5.32       24 Sep 2007

    *) Bugfix: now nginx uses a default listen backlog value of 511 on all
       platforms except FreeBSD. Thanks to Jiang Hong.

Now -1 is used on FreeBSD only, and it means the maximum available backlog. Besides, net.ipv4.tcp_max_syn_backlog = 10 is too small. Also, when the kernel does not have enough backlog, it usually responds with an RST packet, if this is not disabled.
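For completeness, raising the backlog end to end takes both an nginx and a kernel change, because the kernel silently caps every listen() backlog at net.core.somaxconn; the values below are illustrative, not recommendations:

```shell
# Kernel side: raise the cap on listen() backlogs and the SYN queue.
sysctl -w net.core.somaxconn=1024
sysctl -w net.ipv4.tcp_max_syn_backlog=1024

# nginx side (in nginx.conf), overriding the 511 default:
#   listen 80 default backlog=1024;
```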
on 2009-02-10 10:23
I actually tried the latest 0.6.x and 0.7.x. The value of 10 was just for testing and reproducing what I was seeing. I had the system backlog at 1024, so I must have been hitting the 511 of nginx. I verified, at least on RHEL5/CentOS5, that when the backlog is reached no RST packet is sent; the SYN is just dropped and ignored. This is also consistent with the documented backlog behavior in Stevens, Unix Network Programming.

One of the things the documentation suggests, but is very vague about, is that the backlog count may include not only connections in the SYN_RECV state but also connections in a soft ESTABLISHED phase; that is, the server has sent the SYN/ACK but has not yet received the ACK/first data from the client. I'll try to verify this.

It would seem that on even a moderately busy server serving pages of images it is very easy to hit the 511 backlog. Even though my server can saturate a gigabit at 16K req/s under ab, I hit this backlog limit pretty regularly under conditions where the average is 200 req/s. As I explained earlier, when this happens your image is delayed by at least a second until the browser re-requests it. Worst of all, there are no errors or statistics that would suggest this is happening. Of course I could just increase the backlog, but at this point I might as well try to understand the mechanism and figure out how to detect it in the future. I'll investigate some more and post here in the hope that it may help others debug similar issues.

Radek
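Lacking a proper kernel counter, a crude way to catch the peak is to sample the instantaneous SYN_RECV count and keep a running maximum; a small sketch, where the sampling loop and interval are assumptions:

```shell
# peak: read one numeric sample per line on stdin and report the largest
# value seen.  Feed it periodic SYN_RECV counts, e.g. sample for a minute:
#   for i in $(seq 60); do netstat -aln | grep -c SYN_RECV; sleep 1; done | peak
peak() {
    awk '$1 + 0 > max { max = $1 + 0 } END { print max + 0 }'
}
```

One-second sampling will still miss sub-second bursts, which is exactly why the cumulative kernel counters, where available, are the better signal.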
on 2009-02-10 10:37
On Tue, Feb 10, 2009 at 12:09:50AM -0800, Radek Burkat wrote:
> I actually tried the latest 0.6x and .7x. The value of 10 was just for

Oh, -1 should be fixed in the wiki.

> testing and reproducing what I was seeing. I had the system backlog at
> 1024 so I must have been hitting the 511 of nginx. I verified at least on
> rel5/centos5 that when the backlog is reached, no RST packet is sent. The
> SYN is just dropped and ignored. This is also consistent with the
> documentation of backlog behavior from Stevens, Unix Network Programming.
>
> One of the things that the documentation suggests, but is very vague about,
> is that the backlog count may not only be connections that are in the
> SYN_RECV state but also connections that are in the soft ESTABLISHED phase.

Yes, the backlog also holds ESTABLISHED, but not yet accept()ed, sockets. However, on FreeBSD the backlog does not account for sockets in the syncache (SYN_RECV state). I do not know about Linux.