Inconsistent latency to stub_status

Hi,

We use nginx in a latency-sensitive application on our own hardware. We
have a monitoring system running on the local network that fetches the
nginx_status page from the stub_status module once per minute. In
addition to collecting the stats from the page, it also measures the
round-trip latency. Latency is usually under 1 ms, but it sometimes
spikes to 30 ms or more and can stay there for up to 5 consecutive
minutes. I am trying to identify the source of the latency jitter, if
possible.
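For reference, the probe is conceptually something like this (a minimal sketch, not our actual monitoring code; the URL is a placeholder for wherever nginx_status is exposed):

```python
# Minimal sketch of a stub_status latency probe (illustrative only;
# the real monitoring system is a separate product).
import time
import urllib.request

def probe(url, timeout=5.0):
    """Fetch the status page once; return (body, round-trip seconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode()
    return body, time.perf_counter() - start
```

Run once a minute and log the second element of the tuple; the spikes show up as outliers in that series.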

We run Ubuntu 10.04, which has a 100 Hz system timer, so I don't expect
to consistently get responses below 10 ms. Spiking to 30 ms, however,
suggests a configuration issue somewhere, whether in the system or in
nginx. nginx is purely a front end for us; all requests are passed to a
FastCGI backend via unix sockets (no static file serving).
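The FastCGI-over-unix-socket setup looks roughly like this (a sketch only; the socket path and location are placeholders, not our actual config):

```nginx
# Illustrative only: all requests handed to a local FastCGI backend
# over a unix socket, so no TCP hop between nginx and the application.
location / {
    fastcgi_pass unix:/var/run/backend.sock;
    include fastcgi_params;
}
```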

This is an 8-physical-core box with hyperthreading turned on. It isn't
swapping, has very low disk activity, and has no TCP/IP issues that I
can see. CPU use ranges from 20-40% across all cores (the application
and nginx are co-resident). The box handles between 300 and 1500
requests/sec. I have not noticed any correlation between higher overall
load and higher ping latency; it seems fairly random.

Any suggestions on other things to investigate are appreciated.

Config details follow.

Cheers,

Dean

We are running nginx 1.0.5.

nginx.conf (excerpt):

worker_processes 8;
worker_rlimit_nofile 16384;
worker_priority 0;

events {
    multi_accept off;
    accept_mutex off;
    worker_connections 8192;
}

http {
    error_log /var/log/nginx/error.log info;
    access_log /var/log/nginx/access.log main buffer=128k;
    sendfile off;
    tcp_nodelay on;

    server {
        listen 80 default_server backlog=1024;
    }
}

(Two corrections from what I originally pasted: error_log does not take
a buffer= parameter, and the listen directive was missing the address
part, i.e. "listen 80", not "listen :80".)

sysctl.conf (excerpt):

vm.swappiness = 0
vm.vfs_cache_pressure = 50
fs.file-max = 65535
net.ipv4.ip_local_port_range = 2000 65000
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.core.netdev_max_backlog = 5000
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_fin_timeout = 30
net.unix.max_dgram_qlen = 1024
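One sanity check worth doing is confirming the running kernel actually has these values, since a key such as net.core.somaxconn maps to /proc/sys/net/core/somaxconn. A minimal sketch (the helper names are my own):

```python
# Sketch: compare sysctl.conf-style entries against the live values
# under /proc/sys. Key "a.b.c" corresponds to the file /proc/sys/a/b/c.
from pathlib import Path

def parse_sysctl(text):
    """Parse sysctl.conf-style 'key = value' lines into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition('=')
        settings[key.strip()] = ' '.join(value.split())
    return settings

def running_value(key):
    """Read the live value of a sysctl key from /proc/sys."""
    return ' '.join(Path('/proc/sys', *key.split('.')).read_text().split())
```

Loop over parse_sysctl(open('/etc/sysctl.conf').read()) and flag any key where running_value(key) disagrees.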

Posted at Nginx Forum:

I found the issue: this server handles a mix of SSL and plain traffic.
When initially configuring SSL, I generated the CSR from a 4096-bit key
and did not manage cipher selection carefully. As a result, nginx
workers spent a lot of time in OpenSSL calls, unable to service other
requests. The ratio of SSL to plain traffic made the ping latency
erratic.

The fix was straightforward: use a smaller key and select
server-preferred ciphers that are less computationally intensive.
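In config terms the cipher side of the fix amounts to something like the following (a sketch, not our exact config; the cipher string and paths are illustrative):

```nginx
# Illustrative only: prefer the server's cipher order and put cheap
# ciphers first, so workers spend less time in OpenSSL per handshake.
server {
    listen 443;
    ssl on;
    ssl_certificate      /path/to/cert.pem;
    ssl_certificate_key  /path/to/key.pem;
    ssl_prefer_server_ciphers on;
    ssl_ciphers RC4:HIGH:!aNULL:!MD5;   # example preference list
}
```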

Cheers,

Dean