Nginx SSL slow

According to this article http://matt.io/entry/uq nginx is really slow
at SSL. Is this true, and should I be using stud to handle SSL
connections? Or is nginx actually fast, and it’s a configuration issue
or a fluke version of nginx?

Posted at Nginx Forum:

Hello!

On Mon, Jul 11, 2011 at 08:45:07PM -0400, davidkazuhiro wrote:

According to this article http://matt.io/entry/uq nginx is really slow
at SSL. Is this true, and should I be using stud to handle SSL
connections? Or is nginx actually fast, and it’s a configuration issue
or a fluke version of nginx?

A quick test suggests it’s probably down to the new ECDHE ciphers being
faster than the DHE ones, while nginx doesn’t yet support ECDHE out of
the box (there are patches floating around), even when compiled with
OpenSSL 1.0.

Using identical ciphers gives mostly identical results (no surprise).
And obviously all these tests are far from real life, where ECDHE isn’t
really used, while keepalive connections and the session cache matter.
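
For illustration, the directives involved would look something like
this in an nginx http/server context (a sketch; the values are
examples, not taken from these tests):

    ssl_ciphers DHE-RSA-AES256-SHA;     # pin one cipher so both servers negotiate the same suite
    ssl_prefer_server_ciphers on;       # make the server’s cipher preference win
    ssl_session_cache shared:SSL:10m;   # resumed sessions skip the expensive key exchange
    keepalive_timeout 65;               # reuse connections instead of one handshake per request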

Maxim D.

Wait, I’m confused… how do you know these tests were done with ECDHE
ciphers? And if they were, how did he do them if nginx doesn’t support
ECDHE?

Posted at Nginx Forum:

On Jul 12, 2011, at 4:45, davidkazuhiro wrote:

According to this article http://matt.io/entry/uq nginx is really slow
at SSL. Is this true, and should I be using stud to handle SSL
connections? Or is nginx actually fast, and it’s a configuration issue
or a fluke version of nginx?

I believe nginx was not configured to run 8 worker processes.
It seems he ran only 2 worker processes.
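
For reference, that’s a single directive; a sketch for an 8-core box:

worker_processes  8;   # one worker per CPU core, so SSL handshakes can use all cores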


Igor S.
http://sysoev.ru/en/

Until someone can chime in with better advice:

Have a look at the comments on the HackerNews thread:

They point out some issues with the testing methodology; in particular,
the “ssl_session_cache” and keepalive_* configuration options may not
have been adjusted from their default values.

Igor, I did SSL benchmarks with 10 worker processes on a very fast
multicore machine, with multiple ssl_session_cache configs, to try to
disprove this post. My results were also slow:

On a 4-core Xeon E5410 using:

ab -c 50 -n 5000

with 64-bit Ubuntu 10.10 and kernel 2.6.35, I get:

For a 43-byte transparent GIF image over regular HTTP:

Requests per second: 11703.19 [#/sec] (mean)

Same file via HTTPS with various ssl_session_cache params set:

ssl_session_cache shared:SSL:10m;
Requests per second: 180.13 [#/sec] (mean)

ssl_session_cache builtin:1000 shared:SSL:10m;
Requests per second: 183.53 [#/sec] (mean)

ssl_session_cache builtin:1000;
Requests per second: 182.63 [#/sec] (mean)

No ssl_session_cache:
Requests per second: 184.67 [#/sec] (mean)

I’m assuming the session cache has no effect since each ‘ab’ request is
a new session. But I thought I’d try it anyway.

180 per second for a machine this fast compared to 11,703 per second on
regular HTTP seems like a big difference. ‘ab’ was run on the local
machine (it takes very little CPU) so there was zero network latency.

Let me know if there’s anything I should try to speed it up.

Here’s the config I used:

worker_processes  10;
worker_rlimit_nofile  60000;
error_log  logs/error.log;
pid  /var/run/nginx.pid;

events {
    worker_connections  10000;
}

http {
    client_max_body_size  20m;
    client_header_timeout  3m;
    client_body_timeout  3m;
    send_timeout  3m;
    server_names_hash_bucket_size  128;
    client_header_buffer_size  1k;
    large_client_header_buffers  4 4k;
    sendfile  on;
    tcp_nopush  on;
    tcp_nodelay  on;
    server_tokens  off;
    gzip  on;
    gzip_min_length  1100;
    gzip_buffers  4 8k;
    gzip_types  text/plain text/css application/x-javascript
                application/javascript text/xml application/xml
                application/xml+rss text/javascript;
    keepalive_timeout  10 5;

    proxy_next_upstream  off;

    geo $country {
        default  no;
        include  mygeodir/nginxGeo.txt;
    }

    limit_req_zone  $binary_remote_addr zone=slowSite:20m rate=10r/m;
    limit_req_zone  $binary_remote_addr zone=fastSite:20m rate=500r/m;
    limit_req_zone  $binary_remote_addr zone=zonea:20m rate=120r/m;
    limit_req_zone  $binary_remote_addr zone=zoneb:20m rate=60r/m;

    include  mime.types;

    # the rest is basic server sections
}


Mark M.
http://feedjit.com/

Hello!

On Tue, Jul 12, 2011 at 01:10:38PM +0400, Maxim D. wrote:

$ http_load -cipher CAMELLIA256-SHA -parallel 10 -seconds 10 stunnel
$ http_load -cipher CAMELLIA256-SHA -parallel 10 -seconds 10 nginx

[...]

Here you can see that the ECDHE cipher is about 2x faster than DHE.
I believe this is what the author of the test you referenced actually
observed. Both are 3x slower than CAMELLIA256-SHA as shown above,
though.

And again, a disclaimer: all of the above tests SSL handshaking speed,
not a real https workload. Real workloads are expected to be much
different.

Just for completeness, results with the ECDH patch[1], using the
same ECDHE-RSA-AES256-SHA cipher as in the stunnel case above:

$ http_load -parallel 10 -seconds 10 nginx
279 fetches, 10 max parallel, 11997 bytes, in 10.018 seconds
43 mean bytes/connection
27.8498 fetches/sec, 1197.54 bytes/sec
msecs/connect: 1.63012 mean, 37.961 max, 0.272 min
msecs/first-response: 206.536 mean, 604.134 max, 62.889 min
HTTP response codes:
code 200 – 279

(the above disclaimer still applies)

[1] ECDHE key exchange with TLSv1

Maxim D.

Hello!

On Tue, Jul 12, 2011 at 01:58:38AM -0600, Mark M. wrote:

For a 43-byte transparent GIF image over regular HTTP:

ssl_session_cache builtin:1000;
Requests per second: 182.63 [#/sec] (mean)

No ssl_session_cache:
Requests per second: 184.67 [#/sec] (mean)

I’m assuming the session cache has no effect since each ‘ab’ request
is a new session. But I thought I’d try it anyway.

Yes, ab won’t reuse sessions.
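
One way to see the cache actually working is a client that does resume
sessions, e.g. openssl s_time (a sketch; the address is a placeholder):

$ openssl s_time -connect 127.0.0.1:443 -new -time 10     # full handshake on every connection
$ openssl s_time -connect 127.0.0.1:443 -reuse -time 10   # resumes one session; much faster if the cache works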

180 per second for a machine this fast compared to 11,703 per second
on regular HTTP seems like a big difference. ‘ab’ was run on the
local machine (it takes very little CPU) so there was zero network
latency.

I did some tests on a 2 x X5355 server (4 cores each, 8 cores total);
it should be comparable to your E5410. I used empty_gif for the tests
as well.

First of all, ab wasn’t even able to saturate plain http while eating
100% CPU (i.e. a whole CPU core; it can’t use more, being
single-threaded and single-process). It shows only about 13k r/s,
while with 5 ab processes nginx is actually able to handle 50k r/s
over loopback.
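
A sketch of driving it with several ab processes (URL and counts are
illustrative):

$ for i in 1 2 3 4 5; do ab -c 50 -n 5000 http://127.0.0.1/empty.gif & done; wait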

So, about “ab takes very little CPU”: no it doesn’t; it’s awfully
CPU-bound. If you see low numbers in top, make sure top shows %CPU
per core, not averaged over all cores in the system, or you’ll see a
small number like 12.5% (100%/8) for a “whole core loaded, can’t take
more” case. Under Linux this can be toggled by pressing ‘I’
(Irix mode vs. Solaris mode).

Let me know if there’s anything I should try to speed it up.

The same as above applies to https as well. Using a 1024-bit RSA key
and the DHE-RSA-AES256-SHA cipher, with 8 ab processes running from
another host, I see 1800 r/s with the system 100% busy.

Other results include:

1024 bit key, DHE-RSA-AES256-SHA - 1800 r/s
2048 bit key, DHE-RSA-AES256-SHA - 1050 r/s
4096 bit key, DHE-RSA-AES256-SHA - 270 r/s

With ECDHE ciphers (and patch already mentioned in this thread):

1024 bit key, ECDHE-RSA-AES256-SHA - 2740 r/s
2048 bit key, ECDHE-RSA-AES256-SHA - 1340 r/s
4096 bit key, ECDHE-RSA-AES256-SHA - 285 r/s
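
These numbers mostly track raw private-key speed, which OpenSSL’s
built-in benchmark shows without any server involved (a sketch):

$ openssl speed rsa1024 rsa2048 rsa4096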

This is with a fairly trivial nginx config:

worker_processes  8;

error_log /path/to/error_log;

events {
    worker_connections  10240;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    access_log /path/to/access_log;

    server {
        listen       8443;
        server_name  localhost;

        ssl                  on;
        ssl_certificate      cert.pem;
        ssl_certificate_key  cert.key;

        location / {
            empty_gif;
        }
    }
}

Obviously using other cipher suites will produce much different
results.

Just to compare, here are results from stunnel on the same
machine:

1024 bit key, DHE-RSA-AES256-SHA - 1990 r/s
2048 bit key, DHE-RSA-AES256-SHA - 1220 r/s
4096 bit key, DHE-RSA-AES256-SHA - 280 r/s

1024 bit key, ECDHE-RSA-AES256-SHA - 2285 r/s
2048 bit key, ECDHE-RSA-AES256-SHA - 1223 r/s
4096 bit key, ECDHE-RSA-AES256-SHA - 285 r/s

It looks a bit faster with DHE ciphers, and the reason is that stunnel
does not use the SSL_OP_SINGLE_DH_USE option by default. Setting
“options SINGLE_DH_USE” in its config results in the following DHE
performance for stunnel:

1024 bit key, DHE-RSA-AES256-SHA - 1480 r/s
2048 bit key, DHE-RSA-AES256-SHA - 953 r/s
4096 bit key, DHE-RSA-AES256-SHA - 260 r/s
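
For completeness, that setting would be a single line in stunnel.conf
(a sketch; stunnel’s convention drops the SSL_OP_ prefix):

; generate a fresh DH key per handshake (SSL_OP_SINGLE_DH_USE)
options = SINGLE_DH_USE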

Maxim D.

Hello,

On Thu, Jul 14, 2011 at 8:19 AM, Maxim D. wrote:

Hello!

On Tue, Jul 12, 2011 at 01:58:38AM -0600, Mark M. wrote:

Igor I did SSL benchmarks with 10 worker processes on a very fast
multicore machine with multiple ssl_session_cache configs to try and
disprove this post. My results were also slow:

Update from the author:

http://matt.io/technobabble/hivemind_devops_alert:_nginx_does_not_suck_at_ssl/ur

Also of interest, this Hacker News comment: “Unlike the above post,
this fellow actually did some broad cipher testing (http:...”

Hello!

On Tue, Jul 12, 2011 at 01:39:33AM -0400, davidkazuhiro wrote:

Wait, I’m confused… how do you know these tests were done with ECDHE
ciphers?

This is just a guess, based on the ciphers OpenSSL 1.0.0d prefers by
default when working with nginx and stunnel.

And if they were, how did he do them if nginx doesn’t support
ECDHE?

By forcing the cipher to one that is equally supported by all the
programs tested. Good testing programs even have switches to specify
that. :)

E.g. the numbers below are from a virtual machine on my poor old P4
laptop, with stunnel passing connections to nginx, using the cipher
my browser selected during real use:

$ http_load -cipher CAMELLIA256-SHA -parallel 10 -seconds 10 stunnel
540 fetches, 10 max parallel, 23220 bytes, in 10.008 seconds
43 mean bytes/connection
53.9568 fetches/sec, 2320.14 bytes/sec
msecs/connect: 2.12899 mean, 24.401 max, 0.196 min
msecs/first-response: 105.195 mean, 414.064 max, 23.386 min
HTTP response codes:
code 200 – 540

And here is nginx proxy_pass’ing to itself, same cipher:

$ http_load -cipher CAMELLIA256-SHA -parallel 10 -seconds 10 nginx
766 fetches, 10 max parallel, 32938 bytes, in 10.0081 seconds
43 mean bytes/connection
76.538 fetches/sec, 3291.13 bytes/sec
msecs/connect: 1.62532 mean, 22.692 max, 0.262 min
msecs/first-response: 79.0284 mean, 239.204 max, 21.643 min
HTTP response codes:
code 200 – 766

And as a reference point, direct requests to non-ssl nginx (used
as backend in both tests above):

$ http_load -parallel 10 -seconds 10 nossl
7536 fetches, 10 max parallel, 324048 bytes, in 10.0008 seconds
43 mean bytes/connection
753.542 fetches/sec, 32402.3 bytes/sec
msecs/connect: 0.70163 mean, 30.059 max, 0.02 min
msecs/first-response: 6.044 mean, 48.126 max, 0.281 min
HTTP response codes:
code 200 – 7536

So you can see nginx is a bit faster than stunnel when the
CAMELLIA256-SHA cipher is used. On the other hand, using the default
ciphers produces something like this:

$ http_load -parallel 10 -seconds 10 stunnel
243 fetches, 10 max parallel, 10449 bytes, in 10.0243 seconds
43 mean bytes/connection
24.2411 fetches/sec, 1042.37 bytes/sec
msecs/connect: 2.03381 mean, 18.384 max, 0.188 min
msecs/first-response: 239.767 mean, 628.098 max, 68.431 min
HTTP response codes:
code 200 – 243

(actually used cipher: ECDHE-RSA-AES256-SHA)

$ http_load -parallel 10 -seconds 10 nginx
144 fetches, 10 max parallel, 6192 bytes, in 10.0126 seconds
43 mean bytes/connection
14.3818 fetches/sec, 618.418 bytes/sec
msecs/connect: 1.44656 mean, 12.673 max, 0.427 min
msecs/first-response: 395.734 mean, 836.928 max, 124.105 min
HTTP response codes:
code 200 – 144

(actually used cipher: DHE-RSA-AES256-SHA)

Here you can see that the ECDHE cipher is about 2x faster than DHE.
I believe this is what the author of the test you referenced actually
observed. Both are 3x slower than CAMELLIA256-SHA as shown above,
though.

And again, a disclaimer: all of the above tests SSL handshaking speed,
not a real https workload. Real workloads are expected to be much
different.

Maxim D.

Thanks, Adam, for the post.

And thanks, Maxim, for the tip; your guess was correct.

A quick test suggests it’s probably down to the new ECDHE ciphers being
faster than the DHE ones, while nginx doesn’t yet support ECDHE out of
the box (there are patches floating around), even when compiled with
OpenSSL 1.0.

Curiosity satisfied :)

Posted at Nginx Forum: