Forum: NGINX High response time at high concurrent connections

laltin (Guest)
on 2013-09-30 00:50
(Received via mailing list)
I have a news site run by 4 tornado instances with nginx as a reverse
proxy in front of them.

Pages are rendered and cached in memcached, so the response time is
generally less than 3 ms according to the tornado logs.

[I 130918 18:35:37 web:1462] 200 GET / (***.***.***.**) 2.43ms
[I 130918 18:35:37 web:1462] 200 GET / (***.***.***.**) 3.41ms
[I 130918 18:35:37 web:1462] 200 GET / (***.***.***.**) 1.96ms
[I 130918 18:35:37 web:1462] 200 GET / (***.***.***.**) 2.48ms
[I 130918 18:35:37 web:1462] 200 GET / (***.***.***.**) 4.09ms
[I 130918 18:35:37 web:1462] 200 GET / (***.***.***.**) 2.43ms
[I 130918 18:35:37 web:1462] 200 GET / (***.***.***.**) 2.49ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 2.25ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 2.39ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.93ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.70ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 2.08ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.72ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 2.02ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.70ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.74ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.85ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.60ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 1.83ms
[I 130918 18:35:38 web:1462] 200 GET / (***.***.***.**) 2.65ms

When I test this site with ab at concurrency level 1000, I get response
times around 0.8 seconds. Here is the benchmark result:

Document Length:        12036 bytes

Concurrency Level:      1000
Time taken for tests:   7.974 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Keep-Alive requests:    10000
Total transferred:      122339941 bytes
HTML transferred:       120549941 bytes
Requests per second:    1254.07 [#/sec] (mean)
Time per request:       797.407 [ms] (mean)
Time per request:       0.797 [ms] (mean, across all concurrent requests)
Transfer rate:          14982.65 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    7  20.8      0      86
Processing:    57  508 473.9    315    7014
Waiting:       57  508 473.9    315    7014
Total:        143  515 471.5    321    7014

Percentage of the requests served within a certain time (ms)
  50%    321
  66%    371
  75%    455
  80%    497
  90%   1306
  95%   1354
  98%   1405
  99%   3009
 100%   7014 (longest request)

I can handle ~1200 requests/second with 1000 concurrent connections, and
when I run the same benchmark with 100 concurrent connections I again
handle around 1200 requests/second, but the response time drops to ~80 ms.

When it comes to real life, with 1000 concurrent connections users will
face 0.8-second response times, which I think is a bad value.

My question is: why do response times increase when the concurrency level
is increased?

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,243246,243246#msg-243246
Yichun Zhang (agentzh) (Guest)
on 2013-09-30 04:26
(Received via mailing list)
Hello!

On Sun, Sep 29, 2013 at 3:50 PM, laltin wrote:
>
> I can handle ~1200 requests/seconds with 1000 concurrent connections and
> when I do the same benchmark with 100 concurrent connections I can again
> handle around 1200 requests/second but response time drops to ~80 ms.
>

What you're seeing is a very common phenomenon.

Tools like ab always push the target server to its maximum throughput,
i.e., the 1200 req/sec you're seeing is already the throughput limit for
that service on your server. Because you're already at the throughput
limit, a higher concurrency level can only be paid for with higher
response latency; otherwise you'd get even higher throughput, which is
impossible (remember, you're already at the limit ;)).
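[Editor's note: this relationship is Little's law (L = λW): at a fixed throughput, mean latency grows linearly with the number of in-flight requests. A quick sanity check against the ab numbers quoted in this thread:]

```python
def expected_latency_ms(concurrency, throughput_rps):
    """Mean latency implied by Little's law (L = lambda * W), in milliseconds."""
    return concurrency / throughput_rps * 1000.0

# Numbers from the ab runs in this thread:
print(expected_latency_ms(1000, 1254.07))  # ~797 ms -- matches ab's mean "Time per request"
print(expected_latency_ms(100, 1200.0))    # ~83 ms -- matches the 100-connection run
```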

Regards,
-agentzh
Reinis Rozitis (Guest)
on 2013-09-30 09:32
(Received via mailing list)
> My question is why response times increase when concurrency level is
> increased?

What are your worker_processes and worker_connections values?

p.s. Also, to be sure: has the OS open-files limit been raised from the
typical default of 1024?
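[Editor's note: for reference, a minimal sketch of the settings Reinis is asking about; the values are illustrative, not a recommendation:]

```nginx
# /etc/nginx/nginx.conf -- illustrative values only
worker_processes  auto;        # one worker per CPU core ("auto" needs nginx >= 1.3.8;
                               # on older versions set an explicit number)
worker_rlimit_nofile  8192;    # raise the per-worker open-file limit

events {
    worker_connections  4096;  # max simultaneous connections per worker; a proxied
                               # request uses two (client side + upstream side)
}
```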

rr
laltin (Guest)
on 2013-09-30 16:25
(Received via mailing list)
But looking at the tornado logs I expect around 2000 reqs/sec. Assuming
that each request is handled in 2 ms, one instance can handle 500
reqs/sec, and with 4 instances it should be 2000 reqs/sec. But it is
stuck at 1200 reqs/sec, and I wonder why.
Does increasing the number of instances change the result?
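[Editor's note: the arithmetic above can be made explicit; it also shows what effective per-request time the measured throughput implies. A back-of-the-envelope sketch, not a measurement:]

```python
# Capacity math for 4 single-threaded backend instances.
instances = 4
service_time_ms = 2.0                       # per-request time from the tornado logs

ideal_rps = instances * 1000.0 / service_time_ms
print(ideal_rps)                            # 2000.0 req/sec in theory

measured_rps = 1200.0
implied_ms = instances * 1000.0 / measured_rps
print(implied_ms)                           # ~3.33 ms effective time per request
# The gap between 2 ms and ~3.33 ms is time spent outside the handlers
# (proxying, queueing, logging, kernel work) -- i.e. the bottleneck to find.
```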

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,243246,243255#msg-243255
Yichun Zhang (agentzh) (Guest)
on 2013-10-01 05:03
(Received via mailing list)
Hello!

On Mon, Sep 30, 2013 at 7:25 AM, laltin wrote:
> But looking at tornado logs I expect around 2000reqs/sec. Assuming that each
> request is handled in 2ms one instance can handle 500reqs/sec and with 4
> instances it should be 2000 req/sec. But it is stuck at 1200 reqs/sec I wonder
> why it is stuck at that point?
> Does increasing the number of instances change the result?
>

It depends on where the bottleneck really is :)

You can use the on-CPU flame graph and off-CPU flame graph tools to
check the on-CPU and off-CPU time usage by sampling your related
processes under load. In particular, use the following tools on
Linux:

    https://github.com/agentzh/nginx-systemtap-toolkit...
    https://github.com/agentzh/nginx-systemtap-toolkit...

These tools are very general and not specific to Nginx processes, BTW.
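[Editor's note: assuming SystemTap and the nginx-systemtap-toolkit are installed, the sampling workflow looks roughly like this; tool names follow the toolkit's README, while the PID lookup, file names, and the 5-second window are illustrative:]

```shell
# Sample user-space on-CPU stacks of one loaded nginx worker for 5 seconds
# (the toolkit's sample-bt-off-cpu tool covers off-CPU time the same way).
./sample-bt -p $(pgrep -f 'nginx: worker' | head -1) -t 5 -u > a.bt

# Fold the stacks and render the SVG with Brendan Gregg's FlameGraph scripts.
stackcollapse-stap.pl a.bt > a.cbt
flamegraph.pl a.cbt > a.svg
```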

Best regards,
-agentzh
bryndole (Guest)
on 2013-10-02 23:02
(Received via mailing list)
At that level of concurrency you are only testing the throughput of ab,
not nginx. Apache Bench is a single process and is too slow to test
anything other than a single Apache instance.

In summary, ab is a piece of crap.

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,243246,243389#msg-243389
laltin (Guest)
on 2013-10-02 23:13
(Received via mailing list)
But while benchmarking with ab I also test the site with a browser, and
I experience high response times. What do you recommend for testing?

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,243246,243390#msg-243390
bryndole (Guest)
on 2013-10-02 23:43
(Received via mailing list)
It sounds like ab is already succeeding in generating a load high enough
to impact performance. You need to figure out where the bottlenecks are.

My message is intended as a helpful hint that you should not trust the
stats from ab. You need to be measuring performance on the nginx and
backend servers. Start by looking at the elapsed time in nginx and the
response times of the backends.

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,243246,243391#msg-243391
Oscar Fernandez Sierra (Guest)
on 2013-10-03 00:25
(Received via mailing list)
Hello.

Perhaps this page can be useful for you to choose tools for testing:

http://www.softwareqatest.com/qatweb1.html#LOAD

Greetings,

Oscar
laltin (Guest)
on 2013-10-04 16:49
(Received via mailing list)
My backend logs show response times generally less than 2 ms, and with 4
instances the backends can handle 2000 reqs/sec.
But when I look at the nginx logs I see response times around 200 ms,
and I think nginx is the main problem in my situation.
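[Editor's note: one way to tell whether that 200 ms is spent inside nginx or waiting on the backends is to log both timings; `$request_time` and `$upstream_response_time` are standard nginx variables. A config sketch with an illustrative log path:]

```nginx
# http {} context -- compare total time vs. time spent waiting on the upstream
log_format timing '$remote_addr "$request" status=$status '
                  'request_time=$request_time '
                  'upstream_time=$upstream_response_time';

access_log  /var/log/nginx/timing.log  timing;
```

If `upstream_time` stays near 2 ms while `request_time` is ~200 ms, the delay is queueing or buffering on the nginx side rather than slow backends.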

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,243246,243405#msg-243405
Yichun Zhang (agentzh) (Guest)
on 2013-10-04 21:01
(Received via mailing list)
Hello!

On Fri, Oct 4, 2013 at 7:48 AM, laltin wrote:
> My backend logs show response times generally less than 2ms and with 4
> instances backends can handle 2000 reqs/sec.
> But when I look at nginx logs I see response times around 200ms and I think
> nginx is the main problem with my situation.
>

As I've suggested, use the off-CPU and on-CPU flamegraph tool to check
your loaded Nginx worker processes. It could easily be a configuration
issue or something. Please do that.

See

    https://github.com/agentzh/nginx-systemtap-toolkit#sample-bt

and

    https://github.com/agentzh/nginx-systemtap-toolkit...

Also, ensure your Nginx is not flooding its error log file, which is
very expensive. If it is, you will see it clearly in the flame graphs
anyway.

Stop guessing, start tracing! :)

Best regards,
-agentzh