Reverse SSL proxy - speed & jitter

I am setting up an nginx reverse SSL proxy on a machine with 2 E5-2650 CPUs
and lots of RAM. I have nginx-1.6.0 + openssl-1.0.1h installed. I have taken
into consideration most of the optimization suggestions out there and
incorporated them. I will attach a copy of my config file.

(Optimizing the first-connection experience is the goal.) In my testing, just
the handshake + connection setup with a 2K cert takes 3.5 ms on average. I
see spikes in this time every 40 or so handshakes. I would like the 90+
percentile of the handshakes to not have any jitter/variance.

Testing method:

for i in {1..1000}; do
  httperf --hog --server localhost --port 443 --ssl --uri /nginx_ping \
    --ssl-no-reuse --num-calls 1 --num-conns 1 --rate 1 \
    | egrep "Connection time \[ms\]: |Reply time \[ms\]: " \
    | awk '{print $5}' | xargs | tr -s " " ", " >> test.log
done

If you think this methodology is not right, do let me know. I have looked
at the tcpdumps and made sure a full handshake is happening and then a
request is issued.

This gives me: request_time, connect_time, response_time
request_time = connect_time (ssl handshake + connection setup) + response_time

  1. I want to debug why there is jitter in the handshake time - I want the
    90th, 95th, 99th, and 99.9th percentiles to also be around 3.5 ms.
  2. I want to see if I can make the nginx handshake any faster. What is the
    fastest you think this can happen?
  3. How can I profile nginx and proceed to make this faster?
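
To actually see those tail percentiles, a small helper over test.log can
compute them. This is only a sketch: the nearest-rank method and the
assumption that the connect time is the 2nd comma-separated field of each
test.log line come from the loop above, not from the thread.

```shell
# Nearest-rank p-th percentile of the numbers on stdin (a sketch).
pctl() {
  sort -n | awk -v p="$1" '
    { a[NR] = $1 }                                     # buffer sorted values
    END { i = int((NR * p + 99) / 100);                # ceil(NR*p/100)
          if (i < 1) i = 1; print a[i] }'
}

# Hypothetical usage on the connect-time column of test.log:
#   for p in 90 95 99; do cut -d, -f2 test.log | tr -d " " | pctl "$p"; done

# Self-contained demo on the values 1..1000:
seq 1 1000 | pctl 99    # prints 990
```

If the spikes are rare, comparing p50 against p99/p99.9 this way makes the
jitter visible immediately instead of being averaged away.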

all comments are welcome!


Not sure how to attach the config, so config details:
5 workers, worker_priority -10, timer_resolution 200ms, worker_cpu_affinity
to separate cores on cpu2, error_log to /dev/null, use epoll,
worker_connections 20000, multi_accept on, accept_mutex off, sendfile on,
tcp_nopush on, tcp_nodelay on, file caches, keepalive_timeout 5000,
keepalive_requests 100000, reset_timedout_connection on, client_body_timeout
10, send_timeout 2, gzip, server_tokens off, postpone_output 0. upstream:
keepalive 180, proxy_buffering off, client_body_buffer_size 512K,
large_client_header_buffers 4 64k, client_max_body_size 0. server: listen
443 ssl, access_log off, ssl_buffer_size 8k, ssl_session_timeout 10m,
ssl_protocols SSLv3 TLSv1, ssl_ciphers RC4-MD5, ssl_prefer_server_ciphers
on, ssl_session_cache shared:SSL:10m. location /nginx_ping - return 200.

Posted at Nginx Forum:

Full Config:

#user nobody;

# This number should be, at maximum, the number of CPU cores on your system
# (since nginx doesn't benefit from more than one worker per CPU).
worker_processes 5;

# give worker processes the priority (nice) you need/wish; it calls setpriority().
worker_priority -10;

# decrease the number of gettimeofday() syscalls. By default gettimeofday()
# is called after each return from kevent(), epoll, /dev/poll, select(), poll().
timer_resolution 200ms;

#trying to set CPU affinity
worker_cpu_affinity 10001 10010 10011 10100 10101;

# error_log LOGFILE [debug_core | debug_alloc | debug_mutex | debug_event |
# debug_http | debug_imap] | debug | crit | emerg;
error_log /dev/null emerg;

pid var/state/;

# Number of file descriptors used for nginx. This is set in the OS with
# 'ulimit -n 200000' or using /etc/security/limits.conf
# (roughly worker_connections * 2).
#worker_rlimit_nofile 60000;

events {
use epoll;
worker_connections 20000;
# Accept as many connections as possible after nginx gets notified
# about a new connection.
# May flood worker_connections, if that option is set too low.
multi_accept on;
accept_mutex off;
}

http {
default_type application/octet-stream;

log_format main '[$time_local] - [$time_iso8601] - [$request_time] - [$upstream_response_time] - $remote_addr '
                '"$http_x_forwarded_for" - $remote_user - [$request_time] '
                '$status - $request_length - $body_bytes_sent - "$http_referer" - '
                '"$http_user_agent" - $uri - $request_method - '
                '$ssl_protocol - $ssl_cipher';

sendfile        on;

# tcp_nopush causes nginx to attempt to send its HTTP response head in
# one packet, instead of using partial frames. This is useful for
# prepending headers before calling sendfile, or for throughput optimization.
tcp_nopush on;

# the TCP_NODELAY option. The option is enabled only when a connection is
# transitioned into the keep-alive state.
tcp_nodelay on;

# Caches information about open FDs, frequently accessed files.
# Changing this setting, in my environment, brought performance up from
# 560k req/sec to 904k req/sec.
# I recommend using some variant of these options, though not the
# specific values listed below.
open_file_cache max=200000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
open_log_file_cache max=100000 inactive=2m valid=10m min_uses=2;

keepalive_timeout  5000;

# Number of requests which can be made over a keep-alive connection.
# Review and change it to a more suitable value if required.
keepalive_requests 100000;

# allow the server to close the connection after a client stops responding.
# Frees up socket-associated memory.
reset_timedout_connection on;

# send the client a "request timed out" if the body is not loaded by
# this time. Default 60.
client_body_timeout 10;

# If the client stops reading data, free up the stale client connection
# after this much time. Default 60.
send_timeout 2;

# Compression. Reduces the amount of data that needs to be sent
# over the network.
gzip on;
gzip_min_length 10240;
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain text/css text/xml text/javascript
application/x-javascript application/xml;
gzip_disable "MSIE [1-6].";

client_body_temp_path var/state/nginx/client_body_temp 1 2;
proxy_temp_path var/state/nginx/proxy_temp 1 2;
fastcgi_temp_path var/state/nginx/fastcgi_temp 1 2;
uwsgi_temp_path var/state/nginx/uwsgi_temp 1 2;
scgi_temp_path var/state/nginx/scgi_temp_path 1 2;

server_tokens off;
postpone_output 0;

upstream downstream_service {
    keepalive 180;
}

# Turn off proxy buffering
proxy_buffering off;
proxy_buffer_size 128K;
proxy_busy_buffers_size 128K;
proxy_buffers 64 4K;

client_body_buffer_size 512K;
large_client_header_buffers 4 64k;

limit_conn_zone $server_name zone=perserver1:32k;

# Allow arbitrary size client posts
client_max_body_size 0;

# HTTPS Server config
server {
    listen       443 ssl;



# Buffer log writes to speed up IO, or disable them altogether
    access_log off;  # turn off for better performance

    ssl_certificate      /dev/shm/;
    ssl_certificate_key  /dev/shm/;

    # Do not overflow the SSL send buffer (causes extra round trips)
    ssl_buffer_size 8k;

    ssl_session_timeout  10m;

    ssl_protocols  SSLv3 TLSv1;

    ssl_ciphers RC4-MD5;

    ssl_prefer_server_ciphers   on;
    ssl_session_cache   shared:SSL:10m;

    set $host_header $host;
    if ($http_host != "") {
        set $host_header $http_host;
    }

    location / {
        proxy_pass          http://downstream_service/;
        proxy_http_version  1.1;
        proxy_set_header    Connection "";
        proxy_set_header    Host $host_header;
        proxy_set_header    X-Real-IP $remote_addr;
        proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;

        limit_conn perserver1 180;
    }

    # Nginx health check only to verify server is up and running
    location /nginx_ping {
        return 200;
    }
}
}


On Wednesday 23 July 2014 14:00:36 newnovice wrote:

Full Config:


You may get better results if you remove most of the "optimizations" (as you
called them).

Please don't collect bad advice. You shouldn't use any directive without a
full understanding of what it's for according to the official documentation
(nginx documentation), and of what real problem you're trying to solve.
Those spikes can be a result of "multi_accept on".
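
For comparison (not from the thread, just a sketch of the suggestion): an
events block that keeps the defaults instead of forcing those options would
be simply:

```
events {
    use epoll;
    worker_connections 20000;
    # multi_accept defaults to off; accept_mutex defaults to on in nginx 1.6
}
```

With accept_mutex on, only one worker accepts at a time, which tends to
smooth out the bursts that multi_accept can cause.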

error_log /dev/null emerg;

Everyone who added this line to config should be fired.

send_timeout 2;

Mobile clients will blame you for this setting.

wbr, Valentin V. Bartenev

What is the fastest SSL connection setup time that anyone can achieve? Also,
how do I reduce the jitter/variance?

So what settings should I take out and what should I have - do you have an
example of an optimal config?
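
As a sketch of the kind of minimal config being hinted at (not an official
recommendation; the upstream name, address, and certificate paths below are
placeholders): start from defaults and add only the TLS proxy essentials.

```
worker_processes auto;

events {
    worker_connections 20000;
}

http {
    upstream backend {            # placeholder name
        server 127.0.0.1:8080;    # placeholder address
        keepalive 180;
    }

    server {
        listen 443 ssl;
        ssl_certificate     /path/to/cert.pem;   # placeholder
        ssl_certificate_key /path/to/key.pem;    # placeholder
        ssl_session_cache   shared:SSL:10m;
        ssl_session_timeout 10m;

        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
```

Measuring from such a baseline and then re-adding tuning directives one at a
time makes it clear which one introduces the jitter.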

The logs are turned off to see first if this is even a viable option; I'll
turn up debug and check stuff if needed.


This is for a very fast, internal-only, API-driven service - not for serving
webpages/static files/multimedia.
