Keep-alive message flood

Hi all,

I am running an application that uses the push module
nginx_http_push_module-0.712.

The relevant setup is:
server {
    listen 443 default ssl;
    ## SSL Certs
    ssl on;
    ssl_certificate /ssl_keys/coachmaster.co.uk.crt;
    ssl_certificate_key /ssl_keys/coachmaster.co.uk.key;
    ssl_ciphers HIGH:!ADH:!MD5;
    ssl_prefer_server_ciphers on;
    ssl_protocols TLSv1;
    ssl_session_cache shared:SSL:1m;
    ssl_session_timeout 5m;
    #
    server_name example.co.uk www.example.co.uk;
    root /var/www/example.co.uk/htsecure;
    access_log /var/www/example.co.uk/access.log;
    index index.php;
    #
    # serve php via fastcgi if it exists
    location ~ .php$ {
        # try_files $uri =404;
        include /etc/nginx/fastcgi_params;
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param CENTRAL_ROOT $document_root;
        fastcgi_param RESELLER_ROOT $document_root;
        fastcgi_param ENVIRONMENT production;
        fastcgi_param HTTPS ON;
    }
    # serve static files
    try_files $uri $uri/ /index.php ;
    expires 30m;
    # set up publish/subscribe
    push_store_messages on;
    location /publish {
        push_publisher;
        set $push_channel_id $arg_id;
        push_message_timeout 10s;
        push_max_message_buffer_length 30;
    }
    location /activity {
        push_subscriber;
        push_subscriber_concurrency broadcast;
        set $push_channel_id $arg_id;
        default_type text/plain;
    }
}

The trouble is that this has been working fine for about a year, but a
new user is reporting problems. Everyone else continues to have no
problems.

He is reporting long delays in both IE11 and Chrome. He is in Germany
behind a corporate proxy/firewall. When he tries it from home, all works
OK, so the proxy/firewall is the prime suspect.

For political reasons (quoting security - ha!) the proxy cannot be
altered.

Looking at the server logs from when he was having problems, he is
sending a lot of AJAX requests to the /activity URL, and receiving what
appear to be empty replies (with reply code 200 OK). These rattle
through at top speed.

After anything from 10 to 200 such exchanges, he gets a larger message
(also with 200 OK) that is so delayed it crashes the application with
a “Missing messages” report.
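
To put numbers on the "rattle through at top speed" pattern, the access log can be scanned for runs of closely spaced /activity requests per client. The following is only a minimal sketch: it assumes the log is in nginx's default "combined" format (which is what an access_log directive without a format name produces), and the function name activity_bursts and the one-second gap threshold are illustrative choices, not part of the application.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes nginx's default "combined" log format; adjust the pattern if
# the access_log directive names a custom log_format.
LINE_RE = re.compile(
    r'(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def activity_bursts(lines, path="/activity", max_gap=1.0):
    """Map each client address to its longest run of requests to `path`
    where consecutive requests are at most `max_gap` seconds apart.

    Log timestamps only have one-second resolution, so gaps are coarse.
    """
    last_seen = {}       # addr -> time of the previous matching request
    current = Counter()  # addr -> length of the run in progress
    longest = Counter()  # addr -> longest run seen so far
    for line in lines:
        m = LINE_RE.match(line)
        # Prefix match on the URI, so the query string is ignored.
        if not m or not m.group("uri").startswith(path):
            continue
        addr = m.group("addr")
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        prev = last_seen.get(addr)
        if prev is not None and (when - prev).total_seconds() <= max_gap:
            current[addr] += 1
        else:
            current[addr] = 1
        longest[addr] = max(longest[addr], current[addr])
        last_seen[addr] = when
    return dict(longest)
```

Run against the log file, for example with `with open("/var/www/example.co.uk/access.log") as f: print(activity_bursts(f))`, a healthy long-polling client should show short runs, while the affected user should stand out with runs in the tens or hundreds.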

The jQuery call his code is making already has nocache set, so the URL
contains a "&_=" parameter to break any caching.

Oddly, when I added a similar tweak to break the cache, I managed to
trigger the same behaviour. There was no proxy in my setup, and I had
disabled browser caching. My round trip was 6ms, so I did not see delays
long enough to crash things. However, I could not find out what was
causing the problem, so I took out my tweak. It only went in because
I suspected the nocache was being ignored.

Ideas anyone? I’m stumped.

Does anyone understand how a proxy could mess things up for Nginx? How
can I prove it's the proxy? Crucially, how can I compensate?

Thanks
Ian


Hey Ian,

I am not an nginx expert, but I would try reproducing a similar setup
with a proxy to understand how it behaves.
If you can get tcpdump dumps from the client and server side you might
be able to notice the different sides of the issue.
If your nginx server is doing its job then you will probably see it in
the server dumps.
In any case, once you have a tcpdump you might even discover that
the proxy does something to the request, which could lead you to
understanding what the actual issue is…

In general a proxy can mess things up in more ways than you can
anticipate.
If you have both the client-side and server-side request/response dumps
you will have the bigger picture in hand.
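
Once the headers of the same request are available from both ends, comparing them is mechanical. The sketch below is one possible way to do it; the function name diff_headers and the sample header values are made up for illustration. Because the site is served over SSL, raw packet captures will show encrypted data, so the plaintext headers would probably have to come from the browser's developer tools on one side and from server-side logging on the other.

```python
def diff_headers(sent_by_client, seen_by_server):
    """Compare two header mappings, ignoring header-name case.

    Returns (added, removed, changed):
      added   - headers only the server saw (e.g. inserted by a proxy)
      removed - headers the client sent that never arrived
      changed - headers on both sides with different values,
                as (client value, server value) pairs
    """
    client = {k.lower(): v for k, v in sent_by_client.items()}
    server = {k.lower(): v for k, v in seen_by_server.items()}
    added = {k: server[k] for k in server.keys() - client.keys()}
    removed = {k: client[k] for k in client.keys() - server.keys()}
    changed = {
        k: (client[k], server[k])
        for k in client.keys() & server.keys()
        if client[k] != server[k]
    }
    return added, removed, changed


# Hypothetical values, only to show the shape of the result.
added, removed, changed = diff_headers(
    {"Connection": "keep-alive", "Accept": "*/*"},
    {"Connection": "close", "Accept": "*/*", "Via": "1.1 proxy.example"},
)
print(added)    # headers the proxy appears to have inserted
print(changed)  # headers the proxy appears to have rewritten
```

A Via header that appears only on the server side, or a Connection header that changes in transit, would be fairly direct evidence that an intermediary is rewriting the traffic.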

If you need the exact tcpdump commands I will be happy to help you with
them.

Eliezer