Hey, we run a fairly high-volume website: up to nearly 4m
pageviews a day.
At the moment we run a single machine with nginx and mysql and two
worker machines with memcached and tornado instances. The nginx server
is a reverse proxy to the workers and also serves static media.
The CPU load and memory usage on both of the worker boxes are well
within reasonable expectations.
What I am observing is that nginx gets to about 320 requests per second,
then requests start backing up, sometimes taking the server down. See
this image: http://dl.dropbox.com/u/367355/nginx.png
When the server doesn't go down, we see a flattening of requests around
the 320 mark, and the number of "waiting" requests and the memory usage
of nginx spikes considerably.
I've tried upping the number of workers in case all of them are blocking
for long enough to cause this cascading effect (the tornado db driver is
not async) but didn't really see an improvement by adding more. I've
also added lots of async memcached access to avoid hitting the db too
much.
I've included the configs below. Thanks for any help you may have!
[code]
user www-data;
worker_processes 4;
worker_rlimit_nofile 32768;
error_log /dev/null crit;
pid /var/run/nginx.pid;

events {
    worker_connections 8192;
    use epoll;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    access_log /dev/null;
    sendfile on;
    keepalive_timeout 0;
    tcp_nodelay on;
    gzip on;
    gzip_types text/css text/plain text/javascript
               application/x-javascript application/json;
    gzip_comp_level 5;
    gzip_disable "msie6";
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
[/code]
[code]
upstream bar {
    server worker1:8888 max_fails=1 fail_timeout=10s;
    server worker2:8888 max_fails=1 fail_timeout=10s;
}

server { # simple reverse-proxy
    listen 80;
    server_name bar.net;
    #access_log logs/bar.access.log;
    access_log /dev/null;

    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }

    location ^~ /static/ {
        root /home/foo/bar;
        if ($query_string) {
            expires max;
        }
    }

    # pass requests for dynamic content to tornado
    location / {
        proxy_pass_header Server;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Scheme $scheme;
        proxy_pass http://bar;
    }

    error_page 411 /411.html;
    location = /411.html {
        root /home/foo/bar/static/error;
    }

    error_page 500 502 503 504 /500.html;
    location = /500.html {
        root /home/foo/bar/static/error;
    }
}
[/code]
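The /nginx_status location in the config above exposes nginx's stub_status counters, which is presumably where the requests-per-second and "waiting" figures in this post come from. A minimal sketch of deriving those numbers from two samples of that output is below. The sample text, the 10-second interval, and the function names are illustrative assumptions, not data from this server.

```python
import re

# Two hypothetical stub_status bodies, taken 10 seconds apart (made-up numbers).
SAMPLE_T0 = "\n".join([
    "Active connections: 291 ",
    "server accepts handled requests",
    " 16630948 16630948 31070465 ",
    "Reading: 6 Writing: 179 Waiting: 106 ",
])
SAMPLE_T1 = "\n".join([
    "Active connections: 305 ",
    "server accepts handled requests",
    " 16634148 16634148 31073665 ",
    "Reading: 4 Writing: 190 Waiting: 111 ",
])


def parse_stub_status(text):
    """Parse the plain-text body produced by nginx's stub_status module."""
    active = int(re.search(r"Active connections:\s+(\d+)", text).group(1))
    # The counters line holds three integers: accepts, handled, requests.
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    accepts, handled, requests = (int(n) for n in counters.groups())
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    reading, writing, waiting = (int(n) for n in states.groups())
    return {
        "active": active,
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


def requests_per_second(earlier, later, interval_seconds):
    """Average request rate between two parsed samples."""
    return (later["requests"] - earlier["requests"]) / interval_seconds


if __name__ == "__main__":
    t0 = parse_stub_status(SAMPLE_T0)
    t1 = parse_stub_status(SAMPLE_T1)
    print(requests_per_second(t0, t1, 10))  # 320.0 for these made-up samples
    print(t1["waiting"])                    # 111
```

Sampling the counters this way on the nginx box itself (the location only allows 127.0.0.1) gives a rate that does not depend on the graphing tool.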
Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,123754,123754#msg-123754
on 2010-08-26 00:09
on 2010-08-26 00:10
I should note that I'm running 15 workers on each worker box... I just cut them out of the upstream section to make the config shorter. Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,123755#msg-123755
on 2010-08-26 01:01
Are you certain it's Nginx and not Tornado? You might try using

# issue warning if we block for over 200ms
tornado.ioloop.set_blocking_log_threshold(0.2)

Also you don't mention how many Tornado backends you have. If you don't have at least one Tornado backend per Nginx worker, you are probably wasting your time trying to tune Nginx.

As an aside, you might check out ngx_postgres or ngx_drizzle for async db access from Tornado (lets you use Tornado's async httpclient).

Cliff
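The blocking-threshold idea in this reply can be tried without any Tornado-specific call. The sketch below is a swapped-in, standard-library illustration using asyncio's debug mode, which logs a warning when a single callback holds the event loop longer than `slow_callback_duration`. It is not the Tornado API mentioned above, and the handler and helper names are hypothetical.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects warning messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


async def handler_with_blocking_call(block_seconds):
    # Hypothetical stand-in for a request handler that calls a
    # synchronous (non-async) database driver.
    time.sleep(block_seconds)


def slow_callback_warnings(block_seconds, threshold=0.2):
    """Run the handler once and return asyncio's slow-callback warnings."""
    collector = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(collector)
    try:
        async def main():
            # Warn if one step blocks the loop for longer than `threshold` seconds.
            asyncio.get_running_loop().slow_callback_duration = threshold
            await handler_with_blocking_call(block_seconds)

        asyncio.run(main(), debug=True)
    finally:
        logger.removeHandler(collector)
    return [m for m in collector.messages if "took" in m]


if __name__ == "__main__":
    for message in slow_callback_warnings(0.3):
        print(message)
```

A handler that blocks for 0.3 seconds against a 0.2-second threshold should produce one "Executing ... took ... seconds" warning. A quiet log under real traffic suggests the loop is not the bottleneck, which matches what is reported later in the thread.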
on 2010-08-26 01:42
Sorry, it seems my posts have been moderated.. I'll wait it out until someone spots it. :) Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,123774#msg-123774
on 2010-08-26 01:45
Hey Cliff, thanks for the reply.

I mentioned in the second post to this thread that I have a total of 30 workers, 15 on each machine. There are 4 CPUs on each machine; the extra processes are to pick up any slack from blocking DB access.

I have indeed used the IO loop blocking debug... coming here really is a last resort for me! The IOLoop debugging showed some areas I could improve in. Obviously the DB access is unavoidable, but there were also some CPU-intensive spots I could fix, which I did. To avoid too many DB accesses I'm using an async memcached driver.

Now I'm in the situation where the IOLoop debugging issues hardly any messages, the CPU usage is fairly low, and I'm hardly touching the db! That's why I'm here... unfortunately I've covered the things you've mentioned. :(

D

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,123772#msg-123772
on 2010-08-31 08:19
Hello again, we added another machine with 15 workers and are still getting the 320rps cap from nginx: http://dl.dropbox.com/u/367355/nginxday.png I've posted my config earlier... I don't suppose anyone has a suggestion about what might be causing the issue? Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,125474#msg-125474
on 2010-09-01 02:22
Kevin, thanks for your reply.. I've turned off keepalive because the app is a mobile app with very simple js and css. There is very little reason to have keepalive. I'll try putting it up to test though. Cheers! Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,125845#msg-125845
on 2010-09-01 02:29
Hi,

> Kevin, thanks for your reply.. I've turned off keepalive because the app
> is a mobile app with very simple js and css. There is very little reason
> to have keepalive. I'll try putting it up to test though. Cheers!

That .js and .css are good enough reason to have keepalive turned on.

Regarding your original issue: how much time does it take to generate a single response from Tornado? I'm asking because you said that you've got 30 blocking workers, which means that if a single response takes around 100ms, then you can handle only about 300 req/s.

Best regards,
Piotr Sikora < piotr.sikora@frickle.com >
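The arithmetic behind that estimate can be written out as a small sketch. It assumes every worker handles one request at a time and blocks for the full response time; the 100 ms figure is the hypothetical value from the reply above, not a measurement, and the function names are illustrative.

```python
import math


def max_throughput(workers, response_time_ms):
    """Upper bound on requests/second when each worker serves one request at a time."""
    return workers * 1000 / response_time_ms


def workers_needed(target_rps, response_time_ms):
    """Smallest number of fully blocking workers that can sustain target_rps."""
    return math.ceil(target_rps * response_time_ms / 1000)


if __name__ == "__main__":
    print(max_throughput(30, 100))   # 300.0
    print(workers_needed(320, 100))  # 32
```

If the workers only block for part of each request, the real ceiling is higher than this bound, so measuring the actual per-request time in Tornado is the useful next step.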
on 2010-09-01 03:09
Piotr, Most workers aren't blocking... they only block when they hit the db, and we are doing lots of caching. At its peak mysql is registering 140 requests per second. I've added more workers which has had no effect on the capacity going through nginx.. so either the DB itself is causing problems (unlikely since it is such a simple schema with no joins) or something is up with the nginx config. Thanks for giving this some thought! David Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,125864#msg-125864
on 2010-09-06 16:39
OK, I turned keepalive back on (5 seconds) and we are cooking with gas... getting up to 400 requests per second. Thanks everyone :) Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,126986#msg-126986
on 2010-09-06 16:39
Another update... with keepalive 5 the server gets to 450 requests per second then does the weird levelling off thing I've mentioned before. http://dl.dropbox.com/u/367355/nginx450.png Posted at Nginx Forum: http://forum.nginx.org/read.php?2,123754,127462#msg-127462