Server performance issue

Hi all,

Since yesterday, my server has been getting slower and slower…
Sometimes php-fpm takes up 100% CPU, even though there is about 1 GB of
free memory.
Every time I restart nginx and php-fpm, the web server is OK again, but
after about 15 minutes it becomes very slow…

My server runs Debian, and the web stack is
nginx + php-fpm + memcached + MySQL.
The hardware is an i3 with 4 GB of memory, so I don't think it is a
hardware problem.

Below are my configurations. Can anyone give me a suggestion about this
issue? How can I optimize server performance?

Nginx.conf

user www-data;
worker_processes 4;
worker_cpu_affinity 0001 0010 0100 1000;
error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;
events {
use epoll;
worker_connections 10240;
multi_accept on;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
access_log /var/log/nginx/access.log;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
server_names_hash_bucket_size 128;
server_name_in_redirect off;
server_tokens off;
keepalive_timeout 60;
send_timeout 60;
client_header_buffer_size 4k;
large_client_header_buffers 4 4k;
client_max_body_size 20m;
gzip on;
gzip_min_length 1k;
gzip_buffers 4 16k;
gzip_http_version 1.0;
gzip_comp_level 9;
gzip_types text/plain text/css image/x-icon image/bmp
application/x-javascript application/xml;
gzip_vary on;
fastcgi_connect_timeout 300;
fastcgi_send_timeout 300;
fastcgi_read_timeout 300;
fastcgi_buffer_size 64k;
fastcgi_buffers 4 64k;
fastcgi_busy_buffers_size 128k;
fastcgi_temp_file_write_size 128k;
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}

php5-fpm.conf

listen = 127.0.0.1:9000
user = www-data
group = www-data
pm = dynamic
pm.max_children = 66
pm.start_servers = 8
pm.min_spare_servers = 5
pm.max_spare_servers = 16
pm.max_requests = 512

I'm presuming you have an opcode cache?

Also try using php-fpm in static mode; dynamic mode resulted in 502/504s
and high CPU usage for me.

On 26 Out 2010 07h42 WEST, [email protected] wrote:

Hi all,

Since yesterday, my server has been getting slower and slower… Sometimes
php-fpm takes up 100% CPU, even though there is about 1 GB of free memory.
Every time I restart nginx and php-fpm, the web server is OK again, but
after about 15 minutes it becomes very slow…

Try using an init script controlling php-cgi instead of php-fpm.

— appa

I tried static mode and set pm.max_children = 128; then php-fpm took up
100% CPU.
My question is: for a server with an i3 and 4 GB of memory, what are the
right parameters for nginx.conf and php-fpm.conf?

php-fpm generally gives much better performance than php-cgi. I think
there are benchmarks somewhere to prove it.

Well, 128 children is too many. Work out the average response time of
your script, then use this formula:

requests per second * average response time = number of workers (max
children)

Generally a figure no higher than 30 is needed, unless your script needs
improvement.
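As a sketch of that rule of thumb (the traffic and timing numbers below are illustrative, not measurements from this server):

```python
import math

# Rule of thumb from the thread:
#   workers ~= requests per second * average response time
def fpm_max_children(requests_per_second: float, avg_response_time_s: float) -> int:
    """Estimate how many PHP-FPM children are needed to keep up with the load."""
    return math.ceil(requests_per_second * avg_response_time_s)

# A healthy 0.05 s script at 200 req/s needs only 10 workers...
print(fpm_max_children(200, 0.05))  # 10
# ...while a slow 0.5 s script needs 100, which points at the script, not nginx.
print(fpm_max_children(200, 0.5))   # 100
```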

OK, although I don't know the scale of your site, that formula always
works. A decent response time for an application is 0.05 s, and even most
forums return in 0.2 s, so if you are seeing values higher than that, the
problem is in your script.

This isn't really an nginx problem; it's more PHP related.

It’s getting worse and worse.

From php-fpm.log, there are no errors.
But nginx's error.log shows lots of errors:

[error] 21819#0: *16895 connect() to unix:/tmp/php-cgi.sock
failed (11: Resource temporarily unavailable) while connecting to
upstream,
client: 119.145.137.49, server: localhost,…

Sorry, I didn't read your config before posting; you already have this. :)

You probably need to reduce the number of child processes, more is not
better.

On intensive PHP applications I’ve found lower is better.

For example, if you have 100 concurrent connections, this doesn’t mean
you necessarily need 100 PHP-FPM children.

Setting the FPM max_requests parameter to a low number might help (the
default of 500 is fine).

I've found when benchmarking that the PHP processes don't release memory
very well and have a tendency to grow bigger and bigger, getting slower
and taking longer to process requests as they do.

After changing this setting I was able to push several hundred thousand
requests at the server and maintain consistent response times and memory
usage.

Your formulas look interesting; I'll give them a try next time I'm doing
some optimising.
Normally I would just make a rough guess based on req/sec, execution
times, etc., and fine-tune with ab.

I've found that if you have no blocking functions in PHP, i.e. it's CPU
bound, then number of CPU cores + 1 is the most efficient. If it uses
MySQL (it most likely does) and other IO-bound operations, then *2 or *3
is fine. Values between 20 and 30 are common on decent-scale web servers;
in fact, on my i7 with 8 GB of RAM I run 20, and on my AMD X2 with 4 GB
of RAM I run 15.

How busy is your site? Have you had a recent spike in traffic?

I would suggest looking at the response times from MySQL. In particular,
turn on the slow query log and set long_query_time to 1 second or even
less to see what comes up (you need at least MySQL 5.1 to set it to
non-integer values). You can use a tool like "mk-query-digest" from
Maatkit to summarize the slow log and get a sense of what may be holding
up your PHP processes.
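A minimal my.cnf sketch for that suggestion (the log path and threshold are illustrative; a sub-second long_query_time requires MySQL 5.1+):

```ini
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time     = 0.5
```

The resulting log file is what you would then feed to mk-query-digest for a summary.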

You could also profile your php application using xdebug and view the
resulting files using a grind tool to determine where php is spending
most
of its time. Profiling will most definitely crash your application,
especially since you’re already seeing 100% CPU utilization but you can
run
it for a short period of time and review the resulting grind files.

It's unlikely that this is an nginx problem.

On 26 October 2010 12:36, Ilan B. [email protected] wrote:

You could also profile your php application using xdebug and view the
resulting files using a grind tool to determine where php is spending most
of its time. Profiling will most definitely crash your application,
especially since you’re already seeing 100% CPU utilization but you can run
it for a short period of time and review the resulting grind files.

You can set in the ini:

xdebug.profiler_enable=0
xdebug.profiler_enable_trigger=1

Then add XDEBUG_PROFILE to any URL to profile it.

Webgrind is a nice grind viewer; it's hosted on Google Code Project Hosting.

It would be a really bad idea to run either of these on a production
server, especially one that's under high load; the profiling process is
incredibly resource intensive.

Even with pm.max_children = 50, I still get lots of errors…

Oct 26 22:09:36.354159 [WARNING] [pool www] seems busy (you may need to
increase start_servers, or min/max_spare_servers), spawning 8 children,
there are 0 idle, and 12 total children

Oct 26 22:09:37.354215 [WARNING] [pool www] seems busy (you may need to
increase start_servers, or min/max_spare_servers), spawning 16 children,
there are 0 idle, and 17 total children

Oct 26 22:11:20.650232 [WARNING] [pool www] seems busy (you may need to
increase start_servers, or min/max_spare_servers), spawning 8 children,
there are 1 idle, and 17 total children

Oct 26 22:11:41.246244 [WARNING] [pool www] seems busy (you may need to
increase start_servers, or min/max_spare_servers), spawning 16 children,
there are 0 idle, and 21 total children

From: Phil B. [mailto:[email protected]]
Sent: 2010-10-26 19:07
To: [email protected]
Subject: Re: server performance issue

Your formulas look interesting; I'll give them a try next time I'm doing
some optimising.

Normally I would just make a rough guess based on req/sec, execution
times, etc., and fine-tune with ab.

On 26 October 2010 11:58, SplitIce [email protected] wrote:

I've found that if you have no blocking functions in PHP, i.e. it's CPU
bound, then number of CPU cores + 1 is the most efficient. If it uses
MySQL (it most likely does) and other IO-bound operations, then *2 or *3
is fine. Values between 20 and 30 are common on decent-scale web servers;
in fact, on my i7 with 8 GB of RAM I run 20, and on my AMD X2 with 4 GB
of RAM I run 15.

On Tue, Oct 26, 2010 at 9:52 PM, Phil B.
[email protected]
wrote:

Sorry, I didn't read your config before posting; you already have this. :)

You probably need to reduce the number of child processes, more is not
better.

On intensive PHP applications I’ve found lower is better.

For example, if you have 100 concurrent connections, this doesn’t mean
you
necessarily need 100 PHP-FPM children.



2010/10/26 Xin L. [email protected]:

there are 0 idle, and 17 total children

Oct 26 22:11:20.650232 [WARNING] [pool www] seems busy (you may need to
increase start_servers, or min/max_spare_servers), spawning 8 children,
there are 1 idle, and 17 total children

Oct 26 22:11:41.246244 [WARNING] [pool www] seems busy (you may need to
increase start_servers, or min/max_spare_servers), spawning 16 children,
there are 0 idle, and 21 total children

Switch to static process management. Dynamic pm is a great feature for
sharing resources between several sites hosted on one server; if you
have only one site on your server, you should use static pm.
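A minimal php-fpm pool sketch for that suggestion (the child count is illustrative; size it with the req/s × response-time rule of thumb mentioned earlier in the thread):

```ini
pm = static
pm.max_children = 20   ; fixed pool, no spawn/kill churn
pm.max_requests = 500  ; recycle children to cap memory growth
```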

Can you try running ab with, say, 100 or so requests and post the output?

2010/10/26 Xin L. [email protected]

Your issue is probably not with php-fpm. Without knowing more about your
architecture, these are the key points you should look at:

  1. What is your PHP code doing? What is it waiting for? Tools: Xdebug
    (http://www.xdebug.org/), and you could even try tracking it via the
    php-fpm slow-log configuration options:

    • request_slowlog_timeout - the timeout (in seconds) for serving a
      single request, after which a PHP backtrace will be dumped to the
      slow log file. Default: "5s". Note: '0s' means 'off'.
    • slowlog - the log file for slow requests.
  2. How is your database performing? Are there issues with queries?
    Again, look at MySQL's slow query log output. Look at connection
    issues and name resolution with MySQL (it's notoriously bad at
    host-name resolution).
  3. If your site is really busy, you may need to add more PHP application
    servers to handle the load.
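For the slow-log options above, a php-fpm.conf sketch (the path and timeout are illustrative):

```ini
request_slowlog_timeout = 2s
slowlog = /var/log/php-fpm/www.slow.log
```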

Again, this is not an nginx or PHP-FPM problem per se.

2010/10/26 Xin L. [email protected]

Even with pm.max_children = 50, I still get lots of errors…
Oct 26 22:09:36.354159 [WARNING] [pool www] seems busy (you may need to
increase start_servers, or min/max_spare_servers), spawning 8 children,
there are 0 idle, and 12 total children

It's not an error.
The process manager is informing you that the initial start_servers (or
min_spare_servers) value is just too low, and it will spawn new children
to satisfy all the incoming requests…

The only warning you should be worried about is something like this:
Sep 15 17:41:26.901870 [WARNING] [pool www] server reached max_children
setting (70), consider raising it

That one indicates there are no free children and no more will be
spawned.

If you don't like such log messages, you can increase pm.start_servers
(I have 20, for example) or disable the warning messages by setting
log_level = error.

Switch to static process management. Dynamic pm is a great feature for
sharing resources between several sites hosted on one server; if you
have only one site on your server, you should use static pm.

If you have a PHP script like <?php sleep(60); ?>, no process manager or
configuration will deliver the desired req/s once there are more clients
than the proposed child count of 50, as you end up with each request
taking one minute to finish. So while switching the process manager
could reduce the number of warning log entries, it doesn't really solve
the issue (if the cause is a slow application or backend).

As people have stated, if FPM's slow logs show nothing, the requests may
well be hanging on a single DB connection, and neither nginx nor PHP
itself (other than FPM killing long-running scripts via the
request_terminate_timeout setting) can work around that, so it's hard to
give any advice.

The dynamic process manager lets you somewhat automatically adjust
toward the best child count (you can switch to static later if you
really prefer that) and detect the (disaster) events when something
suddenly goes wrong (the manager spawns a lot of new children), while in
the normal scenario it consumes as few system resources as possible.

rr

On 26/10/2010 10:30, SplitIce wrote:

It's getting worse and worse.
From php-fpm.log, there is no error.
But from nginx error.log, there are lots of errors


[error] 21819#0: *16895 connect() to unix:/tmp/php-cgi.sock
failed (11: Resource temporarily unavailable) while connecting to
upstream, client: 119.145.137.49, server: localhost,......

I think this means that nginx has tried to pass a request to PHP, but
PHP is unable to accept it because it has too much work to do. It can't
spawn any more workers, so it has to wait for a worker to finish before
it can accept the request.

Most probably, the most recent change to the PHP script has caused some
form of looping, or a very inefficient lookup in the database (has an
index been removed recently?).

Try going back to the PHP code and database from before the last change,
and see if that improves things.

Regards

Ian