Nginx & FastCGI buffering to slow clients; thousands of connections back up the FastCGI/PHP processes

I assume this has been covered before, but despite lots of searching I
cannot find it.

We run many nginx servers with FastCGI for PHP, but here in China the
connections can be slow or erratic, so suddenly we have 1000 connections
in ‘write’ status in nginx. This should be no problem, but these seem
to tie up the FastCGI processes; eventually they run out and the whole
system locks up from the users’ perspective, with 502 errors since
nginx can’t find any free FastCGI processes to talk to (they are all
busy).

The errors we get are usually "upstream timed out (110: Connection timed
out) while reading response header from upstream, client: 61.149.175.16,
server: 121.13.44.145, request: "GET /…

In this scenario we’d need 1000+ FastCGI processes to handle all the
open connections. We’d prefer to run 10-20 PHP processes, which can
easily handle all the performance needs.

We thought this was simple: just add more buffering so all the PHP
output is held in memory by nginx; nginx will then close the connection
to PHP and another user can use that process. These are big servers,
with 8-16 cores and 24GB+ RAM, so we have plenty of CPU and memory. So
we raised the buffer size to 64KB and added thousands of buffers, etc.,
but nginx’s memory footprint didn’t really increase (it stays very small
at 30-50MB) and the problem didn’t go away. And we see no
buffering-to-disk messages in the logs.

So if the buffers are big enough and/or we have disk space, I am
thinking nginx will ALWAYS buffer ALL the FastCGI data and then close
the upstream connection, so we should NEVER see FastCGI waiting for
nginx to write data to a client - is this correct?

First, are these buffer settings per connection or for all connections?
I assume fastcgi_buffer_size is per connection. But if fastcgi_buffers
is per connection, why have a buffer count, why not just say 32K, 64K,
etc.? So I’m guessing this is the total buffer space available to the
server, in blocks of the buffer size; for example, fastcgi_buffers 1024
64k gives me 64MB of total buffer space. If I have 100 connections, I
can buffer about 600KB each before nginx starts buffering to files.

Without a fix we are running 1000 CGI processes and 1-2K nginx
connections. This works, but if we ever get real load we’ll see a load
average of 250, like we used to see on loaded Apache systems. We need a
way to serve a few thousand slow connections with 10-20 PHP processes.

I assume we have buffering problems, but maybe there is a close or other
issue that prevents the PHP processes from being re-used. Everything
works great when the connections are fast and the percentage of writers
is small.

The FastCGI engine is php5-cgi; maybe we should use spawn-fcgi from
lighttpd.

Key configs are:

events {
    worker_connections 4096;
    use epoll;
    multi_accept off;
}

http {
    sendfile on;
    #tcp_nopush on;

    #keepalive_timeout  0;
    keepalive_timeout  15;
    tcp_nodelay        on;

    server {
        listen 80;
        server_name 120.136.43.145 abc.com.cn 127.0.0.1;

        root /var/www/abc;

        access_log /var/log/nginx/abc.com_access.log;
        error_log /var/log/nginx/abc.com_error.log;

        index  index.html index.php index.htm;

        location ~ \.php$ {
            fastcgi_pass   127.0.0.1:9000;
            fastcgi_index  index.php;
            fastcgi_buffer_size 64k;
            fastcgi_buffers 4096 64k;
            fastcgi_param  SCRIPT_FILENAME /var/www/abc$fastcgi_script_name;
            include fastcgi_params;
        }
    }
}

Posted at Nginx Forum:

Hello!

On Sun, Dec 20, 2009 at 08:49:14AM -0500, mushero wrote:

> busy).
> PHP output is in memory in Nginx, and it will close the
> waiting for nginx to write data to a client - is this correct?
Yes. As long as you don't set fastcgi_max_temp_file_size to
something low, nginx will always buffer the full response from
fastcgi. It will release the backend connection as soon as possible,
so slow clients do not produce any extra load for backends.
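
A minimal sketch of the directives involved (the temp path and sizes
here are illustrative assumptions, not recommendations):

```nginx
location ~ \.php$ {
    fastcgi_pass 127.0.0.1:9000;

    # Per-connection buffers; once the full response is buffered,
    # nginx releases the backend connection immediately.
    fastcgi_buffer_size 64k;
    fastcgi_buffers 8 64k;

    # Default is 1024m; responses larger than the in-memory buffers
    # spill to a temp file here instead of holding the backend open.
    fastcgi_max_temp_file_size 1024m;
    fastcgi_temp_path /var/lib/nginx/fastcgi;

    include fastcgi_params;
}
```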

> First, are these buffer settings per connection or for all
> connections ? I assume fastcgi_buffer_size is per connection.
> But if fastcgi_buffers is per connection, why have a buffer
> count, why not just say 32K, 64K, etc.? So I’m guessing this is

Both fastcgi_buffer_size and fastcgi_buffers are per connection.

Having multiple fastcgi_buffers allows better granularity of
allocations, so for small responses you may save some memory which
is not needed. On the other hand, having many small buffers
imposes some overhead (CPU for processing, network since
occasionally TCP may have to send non-full packets, disk seeks in
case nginx has to use disk buffering).
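
To illustrate the trade-off (the values below are just examples), both
settings allow the same 256k per connection, with different granularity:

```nginx
fastcgi_buffers 64 4k;   # fine-grained: a 6k response pins only two 4k buffers
fastcgi_buffers 4 64k;   # coarse: the same 6k response pins a whole 64k buffer,
                         # but large responses need far fewer allocations
```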

> the total buffers available to the server, in blocks of the
> buffer size, for example fastcgi_buffers 1024 64k gives me 64MB
> of total buffer space. If I have 100 connections, I can buffer
> about 600KB each, etc. before nginx starts buffering to files.

No.

> Without a fix we are running 1000 cgi processes and 1-2K nginx
> connections. This works but if we ever get real load, we’ll
> have a 250 load average, like we used to see on loaded Apache
> systems. We need a way to use 10-20 PHP processes on a few
> thousand slow connections.
>
> I assume we have buffering problems, but maybe there is a close
> or other issue that prevents the php from being re-used, but
> this works great when the connections are fast and the % of
> writers is small.

I think it’s a good idea to start by looking into process states
and/or traces to find out where the bottleneck is.

>     listen 80;
>             fastcgi_pass   127.0.0.1:9000;
>             fastcgi_index  index.php;

Just a note: fastcgi_index is noop here.

>             fastcgi_buffer_size 64k;
>             fastcgi_buffers 4096 64k;

Just a note: this gives up to 256M per connection, up to 1T in
total assuming you are running 1 nginx worker process.
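
The arithmetic, spelled out with the posted values (worst case, every
buffer in use on every connection; the replacement values at the end
are illustrative, not a recommendation):

```nginx
# fastcgi_buffers 4096 64k  -> 4096 * 64k = 256M possible per connection
# worker_connections 4096   -> 4096 * 256M = 1T possible per worker
#
# A more typical per-connection budget:
fastcgi_buffer_size 64k;
fastcgi_buffers 16 64k;    # 1M per connection; 4096 connections -> at most 4G
```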

Maxim D.