Nginx reverse proxy proxies subset of requests slowly

(This is a cross-post from Server Fault:
http://serverfault.com/questions/361742/nginx-reverse-proxy-proxies-subset-of-requests-slowly)

We’re running an nginx reverse proxy in front of a couple of IIS 7.5 web
servers. I’m benchmarking a particular page using Apache Bench. The page
is fully cached in memory in IIS (using ASP.NET outputcache). No caching
is configured for nginx.

We’ve noted a discrepancy in the benchmark results between runs straight
up against one IIS server (no reverse proxying) and with an nginx
reverse proxy in between. With the proxy in place, for non-trivial
loads, a subset of requests take very long to complete. Without the
proxy in place, all requests are completed in reasonably good time.

I’ve benchmarked with nginx running on machines both large and small,
with the same result. The Apache Bench output is included below: the
first listing was generated in a run straight against IIS, the second
with nginx in place. The nginx error log shows nothing untoward.

My question is whether anyone has clues as to which part of nginx, or
of the nginx-IIS interaction, might cause this phenomenon. Ideas on
where we should start looking, or on whether it might just be a
benchmarking artifact, are also welcome.

IIS

user@host:~$ ab -n 5000 -c 1000 [IIS-Host]
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd,
http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking [IP] (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Completed 5000 requests
Finished 5000 requests

Server Software: Microsoft-IIS/7.5
Server Hostname: [IP]
Server Port: [port]

Document Path: [PATH]
Document Length: 37840 bytes

Concurrency Level: 1000
Time taken for tests: 12.592 seconds
Complete requests: 5000
Failed requests: 0
Write errors: 0
Total transferred: 190905000 bytes
HTML transferred: 189200000 bytes
Requests per second: 397.08 [#/sec] (mean)
Time per request: 2518.385 [ms] (mean)
Time per request: 2.518 [ms] (mean, across all concurrent requests)
Transfer rate: 14805.57 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1   12   17.9      3     62
Processing:    37 1944 1351.6   1548   7429
Waiting:       18 1522  751.9   1531   6248
Total:         68 1956 1343.9   1551   7432

Percentage of the requests served within a certain time (ms)
50% 1551
66% 1577
75% 1600
80% 1839
90% 4001
95% 5682
98% 6178
99% 6377
100% 7432 (longest request)
user@host:~$

nginx

user@host:~$ ab -n 5000 -c 1000 [NGINX-HOST]
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd,
http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking [HOST] (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Completed 5000 requests
Finished 5000 requests

Server Software: nginx/1.0.11
Server Hostname: [HOST]
Server Port: 80

Document Path: [PATH]
Document Length: 37840 bytes

Concurrency Level: 1000
Time taken for tests: 46.770 seconds
Complete requests: 5000
Failed requests: 0
Write errors: 0
Total transferred: 190490000 bytes
HTML transferred: 189200000 bytes
Requests per second: 106.91 [#/sec] (mean)
Time per request: 9353.928 [ms] (mean)
Time per request: 9.354 [ms] (mean, across all concurrent requests)
Transfer rate: 3977.48 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1  445 1130.0      6   7072
Processing:    14 3088 5987.3    825  43550
Waiting:        9 2541 6125.0    199  43541
Total:         18 3532 6064.6   1168  43554

Percentage of the requests served within a certain time (ms)
50% 1168
66% 2333
75% 3306
80% 3590
90% 14597
95% 19448
98% 23996
99% 25970
100% 43554 (longest request)
user@host:~$
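
To make the discrepancy concrete, the percentile tables from the two
listings can be compared directly. The following is only a minimal
Python sketch: the numbers are copied from the ab output above, and the
names iis, nginx and tail_ratios are illustrative, not part of any tool.

```python
# Percentile -> latency in ms, copied from the two ab listings above.
iis = {50: 1551, 66: 1577, 75: 1600, 80: 1839, 90: 4001,
       95: 5682, 98: 6178, 99: 6377, 100: 7432}
nginx = {50: 1168, 66: 2333, 75: 3306, 80: 3590, 90: 14597,
         95: 19448, 98: 23996, 99: 25970, 100: 43554}


def tail_ratios(direct, proxied):
    """Return {percentile: proxied / direct}, rounded to two decimals."""
    return {p: round(proxied[p] / direct[p], 2) for p in sorted(direct)}


if __name__ == "__main__":
    for pct, ratio in tail_ratios(iis, nginx).items():
        print(f"{pct:>3}%  IIS {iis[pct]:>5} ms  nginx {nginx[pct]:>5} ms  x{ratio}")
```

The median is actually lower through nginx (1168 ms against 1551 ms).
The slowdown is confined to the upper percentiles, where requests take
several times longer than against IIS directly.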

Posted at Nginx Forum:

Hello!

On Mon, Feb 20, 2012 at 12:59:38PM -0500, anonymous_coward wrote:

> reverse proxy in between. With the proxy in place, for non-trivial
> loads, a subset of requests take very long to complete. Without the
> proxy in place, all requests are completed in reasonably good time.

[…]

> nginx
>
> user@host:~$ ab -n 5000 -c 1000 [NGINX-HOST]

[…]

> Concurrency Level: 1000

Under which OS do you run nginx? Please note that 1000 is too high
for nginx on Windows; see the known issues list here:

http://nginx.org/en/docs/windows.html#known_issues

If running nginx under Windows, please also make sure you have
worker_processes set to 1.

Maxim D.

Nginx is running on Linux (Ubuntu).

Posted at Nginx Forum:

The port range is net.ipv4.ip_local_port_range = 32768 61000, so I guess
that’s pretty standard and shouldn’t be a problem for this benchmark
(which involves at most 500 requests).
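
For reference, the size of that range can be worked out as below. This
is a rough Python sketch, not a diagnosis. The function names are made
up for illustration, and the rate estimate rests on two assumptions that
the thread does not confirm: that every proxied request opens a new
upstream connection, and that each local port stays unusable for the
60 seconds Linux keeps a socket in TIME_WAIT.

```python
TIME_WAIT_SECONDS = 60  # Linux holds closed sockets in TIME_WAIT for 60 s


def ephemeral_port_count(range_text):
    """Number of ports in an ip_local_port_range string like '32768 61000'."""
    low, high = (int(x) for x in range_text.split())
    return high - low + 1  # the range is inclusive at both ends


def max_sustained_rate(port_count, hold_seconds=TIME_WAIT_SECONDS):
    """Rough ceiling on new connections per second to a single upstream.

    Assumption: one new upstream connection per request, and no port is
    reused until hold_seconds after its connection was closed locally.
    """
    return port_count / hold_seconds


if __name__ == "__main__":
    ports = ephemeral_port_count("32768 61000")
    print(ports, "local ports")
    print(round(max_sustained_rate(ports)), "new connections/s at most")
```

On the machine itself the current value can be read from
/proc/sys/net/ipv4/ip_local_port_range and passed to
ephemeral_port_count unchanged.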

I’ve enabled tw_reuse and tw_recycle, but that does not seem to have
much of an effect.

In rough numbers, what’s the expected per-core throughput of nginx in
reverse proxy mode? (with or without gzip’ing)

Posted at Nginx Forum:

I have now done exhaustive tests on a larger instance that yields higher
throughput (>1000 #/s). The /proc/sys/net/ipv4/tcp_syncookies setting
seems to affect how badly this subset of requests hangs (it’s worse
with the setting enabled). I’m seeing “TCP: Possible SYN flooding
on port 80. Dropping request.” (or “Sending cookies”) in kern.log.
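
The kernel logs that message when the queue of pending connections on a
listening socket fills up, so the settings that bound that queue are
worth recording next to each benchmark run. Below is a small Python
sketch for dumping them. Which settings to include is my own guess, and
whether the queue size is the actual cause here is not established in
this thread.

```python
import os

# Settings commonly inspected when "Possible SYN flooding" is logged.
# This selection is an assumption, not a list taken from the thread.
SETTINGS = (
    "net/ipv4/tcp_syncookies",
    "net/ipv4/tcp_max_syn_backlog",
    "net/core/somaxconn",
)


def read_sysctls(names=SETTINGS, root="/proc/sys"):
    """Return {name: value or None}; None means the file was not readable."""
    values = {}
    for name in names:
        try:
            with open(os.path.join(root, name)) as f:
                values[name] = f.read().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_sysctls().items():
        print(f"{name.replace('/', '.')} = {value}")
```

Note that the backlog nginx requests through the backlog parameter of
its listen directive is capped by net.core.somaxconn, so both values
matter when ab opens 1000 connections at once.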

Posted at Nginx Forum:

Hello!

On Mon, Feb 20, 2012 at 02:39:14PM -0500, anonymous_coward wrote:

> Nginx is running on Linux (Ubuntu).

Then you may want to check network problems (packet loss?) and
other related things like local ports exhaustion (local port
range small and no tw_reuse/tw_recycle activated?) and firewall
state table overflows.

Maxim D.
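
One way to follow up on the local port exhaustion suggestion is to
count sockets per TCP state on the nginx machine while the benchmark is
running. The Python sketch below parses /proc/net/tcp (IPv4 sockets
only). The function name is illustrative, and a large TIME_WAIT count
would only be a hint, not proof of exhaustion.

```python
from collections import Counter

# State codes used in the fourth column of /proc/net/tcp (hexadecimal).
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the contents of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3:
            code = fields[3].upper()
            counts[TCP_STATES.get(code, code)] += 1
    return counts


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    for state, n in count_tcp_states(text).most_common():
        print(f"{state:<12} {n}")
```

Comparing the TIME_WAIT figure against the size of the local port range
shows how close the machine is to running out of ports for upstream
connections.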