Can anyone recommend some good reading material on scaling a Rails
app? We receive around 5k-7k visitors per day and are running
postgres and rails with fastcgi - we have not implemented caching yet
and are pondering moving to mongrel. We have thrown more hardware at
our application and it seemed to help a bit - but we are looking for the
best growth plan and would love any thoughts, advice or case
studies anyone has - thanks for your time in posting!
Mongrel is recommended, see: http://mongrel.rubyforge.org/
Normally, what you will do is run multiple copies of mongrel behind a
reverse proxy. The Apache proxying support (which is what most people
use) appears to scale to a point. Beyond that you might have to look at
other options. You should also move static resources (such as CSS,
JPEG, PNG, and static HTML) so they are served by the web server as
opposed to the app server (as you have probably configured for fastcgi).
SQL query tuning is one thing that most people neglect. First, make
sure the columns you are using in relationships are indexed. Make sure
any columns you are using in finders (i.e. MyModel.find_by_name(name))
are also indexed. Rails 2.0 should do query caching, but look for
unnecessary trips to the database in your application. Proper care and
feeding of postgres is also important. In one extreme PHP example, a
simple change in the logic reduced the database queries by a factor of 5
and the response time on one particular page from almost a minute to
about 1 second.
Other associated services. For example, for one client we’re forced to
use SMTP mailer connections. The SMTP server has a slow response to
each request, so there’s a perceptible delay when sending mail. To
compensate we wrote a small service to send mail asynchronously. This
may be true of other issues, like shared file systems, etc.
Generally, your pipe to the outside world does not require gigabit
networking (you're limited by the size of your external pipe). Between
your database and your app server, if you have gigabit links, make sure
you use CAT 6 cable and your switch supports gigabit networking. The
same can be said for NFS connections.
Thanks for the quick response - so right now we are running the
database, app and web services all on one server - perhaps time for us
to break out to three servers? We have indexed all our databases
but could return to the code and ensure we are being efficient. From
what I have read, most people recommend more and more hardware - but I
don't understand the relationship between site activity and processing
power. Our current setup is too slow - and it's hurting business. How
hard is it to migrate from FastCGI to mongrel?
Hi Ezra - thanks for the link - can you give any general guidance
in the meantime? It will take a while to order and read your
book.
On Feb 22, 2008, at 11:56 AM, Marc wrote:
Can anyone recommend some good reading material on scaling a Rails
app? We receive around 5k-7k visitors per day and are running
postgres and rails with fastcgi - we have not implemented caching yet
and are pondering moving to mongrel. We have thrown more hardware at
our application and it seemed to help a bit - but we are looking for the
best growth plan and would love any thoughts, advice or case
studies anyone has - thanks for your time in posting!
May I humbly suggest getting a copy of my book that was just finished
yesterday? http://pragprog.com/titles/fr_deploy It covers taking a
rails app from infancy to maturity and covers all the topics of
scaling out like apache/nginx/mongrel as well as Xen and mysql
master-slave and master-master replication.
Cheers-
- Ezra Z.
– Founder & Software Architect
– [email protected]
– EngineYard.com
Normally, what you will do is run multiple copies of mongrel behind a
reverse proxy. The Apache proxying support (which is what most people
use) appears to scale to a point. Beyond that you might have to look at
other options.
Nginx.
You should also move static resources (such as CSS, JPEG, PNG, and
static HTML) so they are served by the web server as opposed to the app
server (as you have probably configured for fastcgi).
You may want to look into spreading your assets to other hosts and using
the ‘asset%d’ trick to get Rails to spread the load…
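In case the 'asset%d' trick is unfamiliar: as far as I know, you set config.action_controller.asset_host to a string such as "http://asset%d.example.com" and Rails fills in a small number per asset, so browsers open connections to several hostnames in parallel. The plain-Ruby sketch below only mimics the idea and is not the Rails implementation. The helper names (asset_host_for, asset_url), the example.com hostnames and the choice of four hosts are all assumptions made up for illustration.

```ruby
require "zlib"

# Hypothetical stand-in for what happens when asset_host contains "%d".
ASSET_HOST_TEMPLATE = "http://asset%d.example.com" # assumed hostnames
ASSET_HOST_COUNT = 4                                # assumed number of hosts

# Pick a host for a given asset path.
# CRC32 is stable across processes, so the same path always maps to the
# same host and browser caches stay warm.
def asset_host_for(path)
  format(ASSET_HOST_TEMPLATE, Zlib.crc32(path) % ASSET_HOST_COUNT)
end

# Full URL for an asset path.
def asset_url(path)
  asset_host_for(path) + path
end

puts asset_url("/stylesheets/main.css")
puts asset_url("/images/logo.png")
```

The hostnames can all point at the same web server to begin with; the spreading only pays off once the static files are served by the web server rather than by Rails.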
SQL query tuning is one thing that most people neglect. First, make
sure the columns you are using in relationships are indexed. Make sure
Maybe. Maybe not. If you’ve got a million users and have a ‘gender’
column, don’t index that, as roughly half are going to be one value and
half the other. I think postgres is smart enough to realize that and
ignore your index, but mysql isn’t. It will use the index and then look
up 500,000 rows and you’ll get worse performance.
Similarly, if a column isn’t very unique and the table is constantly
being updated, the index maintenance overhead will hurt you.
But if you do a lot of lookups on those users by their login and don’t
have login indexed, then yeah, you’re gonna be hurting.
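To put rough numbers on the gender vs. login point, here is a small sketch of column selectivity (distinct values divided by total rows). The selectivity helper is hypothetical, the million-user figure is the one from the example above, and treating logins as unique is an assumption.

```ruby
# Rough selectivity: distinct values divided by total rows.
# Close to 1.0 means nearly unique (a good index candidate);
# close to 0.0 means an index lookup still returns a huge slice of the table.
def selectivity(distinct_values, total_rows)
  return 0.0 if total_rows.zero?
  distinct_values.to_f / total_rows
end

total_users = 1_000_000

gender = selectivity(2, total_users)           # two values across a million rows
login  = selectivity(total_users, total_users) # assuming every login is unique

puts format("gender: %.6f", gender)
puts format("login:  %.6f", login)
```

This is only a first filter. Whether a given index actually helps depends on the queries and the database, so checking the query plan is still worthwhile.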
Other associated services. For example, for one client we’re forced to
use SMTP mailer connections. The SMTP server has a slow response to
each request, so there’s a perceptible delay when sending mail. To
compensate we wrote a small service to send mail asynchronously. This
may be true of other issues, like shared file systems, etc.
http://seattlerb.rubyforge.org/ar_mailer/
may be of use there… not sure if the original questioner has email
issues or not…
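For anyone wondering what sending mail asynchronously looks like in the small, here is a toy sketch using an in-process queue and one worker thread. It is not how ar_mailer works (as I understand it, ar_mailer stores messages in a database table and a separate process delivers them), and deliver_slowly is a made-up stand-in for a slow SMTP call.

```ruby
require "thread"

# Hypothetical stand-in for a slow SMTP delivery; a real app would call
# its mailer here instead.
def deliver_slowly(message, sent)
  sleep 0.05 # pretend the SMTP server is slow
  sent << message
end

outbox = Queue.new # thread-safe FIFO
sent = []

# Background worker: pulls messages off the queue until it sees :stop.
worker = Thread.new do
  while (message = outbox.pop) != :stop
    deliver_slowly(message, sent)
  end
end

# The "request" only enqueues, so it returns right away.
started = Time.now
3.times { |i| outbox << "welcome mail #{i}" }
enqueue_time = Time.now - started

outbox << :stop
worker.join # in a real service the worker would keep running

puts "enqueued in #{enqueue_time.round(4)}s, delivered #{sent.size} mails"
```

An in-memory queue loses anything still pending if the process dies, which is one reason a database-backed queue is attractive for real mail.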
Can anyone recommend some good reading material on scaling a Rails
app? We receive around 5k-7k visitors per day and are running
5k-7k visitors? That could be a little bit of traffic or a lot of
traffic… what’s your actual page requests per second on average? If
each of those visitors only hits one page a day then your scaling
problem is very different than if they hit 100.
postgres and rails with fastcgi - we have not implemented caching yet
and are pondering moving to mongrel.
I’d definitely recommend switching away from fastcgi. Mongrel with
Nginx.
Or perhaps Litespeed.
Caching will almost certainly help as well. But pick your spots so you
don’t spend time caching things that don’t make any sense. Maybe look
into memcache. If you can page cache, that will be your biggest gain.
Also, postgres can definitely stand to be tuned to your specific
situation. See if you’ve got some slow queries and ask on the postgres
lists for help on tuning.
On Feb 22, 2008, at 1:11 PM, Marc wrote:
Hi Ezra - thanks for the link - can you give any general guidance
in the meantime? It will take a while to order and read your
book.
Marc-
So can you expound on what your current pain points are? What kind of
hardware are you currently on? What is the load on the box? Is the
database or the fcgi’s taking most of the resources? Are you RAM
constrained or CPU constrained? What kind of peak traffic do you get?
If you can provide a breakdown of your current setup and what is the
bottleneck then I can better help you. But it sounds like you would
benefit from an additional server to put the database on and then
switching the app servers to nginx + mongrel or thin. This helps
to separate the concerns so you can know whether you need to scale the
database or the application servers.
Give a little more info and we can help figure out the best plan of
attack for you.
Cheers-
- Ezra Z.
Phillip - thank you for your comments - yes our visitors tend to stick
around and browse - here is a sampling of our traffic from Wed. of
this week (via google analytics) :
6,330 Visits
42,607 Pageviews
6.73 Pages/Visit
00:06:53 Avg. Time on Site
55.02% % New Visits
In regards to page requests per second - I'm not sure how to calculate that -
I have data for page views per hour - which gives an average of 30-40
page views per minute - again appreciate your help and advice.
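For reference, the conversion is just division. A quick sketch using the figures quoted above (42,607 pageviews for the day, 30-40 page views per minute); the helper names are made up. Note that pageviews are not the same thing as HTTP requests, and peaks matter more than the average, so treat the result as a rough floor.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Average requests per second from a daily pageview count.
def average_rps(pageviews_per_day)
  pageviews_per_day.to_f / SECONDS_PER_DAY
end

# Requests per second from a per-minute rate.
def rps_from_per_minute(pageviews_per_minute)
  pageviews_per_minute / 60.0
end

puts format("daily average: %.2f req/s", average_rps(42_607))
puts format("busy minute:   %.2f to %.2f req/s",
            rps_from_per_minute(30), rps_from_per_minute(40))
```

So the averages work out to roughly 0.5 to 0.7 page requests per second.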
Hi Ezra - we just upgraded to a Rackspace box with a dual-core CPU and
4 gigs of RAM - but when I run top - almost 90% of CPU is going to
fcgi and postmaster - Wed. was a bigger day for us - see stats below
in the other response - we have everything on one box - database, app,
web - (bad idea??) thanks for your insight and guidance (I am sure we
will buy your book ;).
Thanks for the quick response - so right now we are running the
database, app and web services all on one server - perhaps time for us
to break out to three servers?
I’d put the database on its own and then probably run nginx/mongrel (or
litespeed) on the other two and load balance between them.
We have indexed all our databases
but could return to the code and ensure we are being efficient. From
what I have read, most people recommend more and more hardware - but I
don't understand the relationship between site activity and processing
power. Our current setup is too slow - and it's hurting business. How
hard is it to migrate from FastCGI to mongrel?
Pretty easy. Google around and you’ll find some good tutorials on
setting up mongrel (and mongrel cluster).
Hi Ezra - we just upgraded to a Rackspace box with a dual-core CPU and
4 gigs of RAM - but when I run top - almost 90% of CPU is going to
fcgi and postmaster -
Doesn’t that really just mean those are the only processes doing
something? I know there are some systems that use 100% of the cpu for a
trivial process simply because nothing else wants to run so that process
figures it might as well hog the cpu.
But if they are fighting for CPU time then that’s a problem…
If you hit shift-m while in top what does it say is your most ram hungry
processes? Are you hitting the 4gb limit?
I agree - we should be able to handle TONS more - our RAM is MAXED at
4 gigs - the real problem I think then is postgres taking too long to
execute queries - how can I tell how long it takes to generate a rails page?
(I think I will post in the postgres groups as well to see if they can
help)
Traffic doesn’t come evenly distributed all day long. Are there times
of the day where the performance is fine? Are there times during the
day when performance sucks?
For example, if you don’t get much traffic early in the morning, and
performance is still a problem, then this isn’t a scalability issue. It
might be a configuration or software issue.
Also, are you running in production mode or development mode?
Could be an IO subsystem problem. When you look at top and the server
is busy what does the %wa say? Also try this command and paste us the
output:
iostat -x 5
What kind of disks are in the server? And with what kind of raid
setup? It sounds to me like you just need to get a separate box for
the database. Keeping the database and the fcgi’s on separate boxes
and tuning the configs properly will allow linux to aggressively cache
the stuff you need. With both on the same box they are fighting for
disk io cache.
Also, are you using the default postgresql config? The default config
is tuned for 64 MB of RAM and needs to be dialed in when you have more
RAM.
Cheers-
-Ezra
I agree - we should be able to handle TONS more - our RAM is MAXED at
4 gigs - the real problem I think then is postgres taking too long to
execute queries - how can I tell how long it takes to generate a rails page?
(I think I will post in the postgres groups as well to see if they can
help)
log/production.log should tell you…
In regards to page requests per second - I'm not sure how to calculate that -
I have data for page views per hour - which gives an average of 30-40
page views per minute - again appreciate your help and advice.
Hrm. That’s actually not that much traffic… assuming you can finish
your requests in under a second. What do your logs say about how long
it’s taking to generate rails pages?
What’s your memory usage? Lots of free ram? Or maxed out? If you’ve
got free ram I’d switch to mongrel first. Then, if necessary, move
postgresql to another box.
Ezra - we have talked with you before - and we know you are a rails
guru - one of the best! - thank you for taking time to help (I had to
step away to grab a pastrami sandwich) -
3 x 146 GB (10,000 RPM) SCSI Drives - RAID 5
Top
top - 14:11:08 up 3 days, 23:59, 2 users, load average: 3.03, 2.94, 2.83
Tasks: 180 total, 1 running, 179 sleeping, 0 stopped, 0 zombie
Cpu(s): 30.1% us, 8.9% sy, 0.0% ni, 60.8% id, 0.2% wa, 0.0% hi, 0.0% si
Mem: 4147336k total, 4115540k used, 31796k free, 55668k buffers
Swap: 1052248k total, 256k used, 1051992k free, 2520152k cached
iostat -x 5
Device: rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s rkB/s  wkB/s avgrq-sz avgqu-sz await svctm %util
sda       0.36  41.93  3.80 18.28 175.58 481.78 87.79 240.89    29.77     0.21  9.62  0.95  2.10
On 22 Feb 2008, at 22:43, Marc wrote:
I agree - we should be able to handle TONS more - our RAM is MAXED at
4 gigs - the real problem I think then is postgres taking too long to
execute queries - how can I tell how long it takes to generate a rails page?
(I think I will post in the postgres groups as well to see if they can
help)
Your production log should give you more info:
Processing ExternalController#playlist [GET]
Rendering external/playlist
Completed in 0.02592 (38 reqs/sec) | Rendering: 0.00662 (25%) | DB: 0.00625 (24%) | 200 OK
Rails Log Analyzer could help you, haven’t used it myself, but it
seems like it could provide you with more information.
http://rails-analyzer.rubyforge.org/
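If you just want a quick look before installing anything, a few lines of Ruby can pull the timings out of the log yourself. This is a rough sketch that assumes the log looks like the sample lines quoted earlier in this message; the regexes and the parse_requests helper are my own guesses at the format, and requests that do not render a view (redirects, for example) will not match.

```ruby
# Matches the "Completed in ..." line format shown in the sample log.
COMPLETED = /Completed in (?<total>[\d.]+) .*?Rendering: (?<render>[\d.]+) .*?DB: (?<db>[\d.]+)/
# Matches the "Processing Controller#action" line that starts a request.
PROCESSING = /\AProcessing (?<action>\S+)/

# Returns an array of hashes, one per completed request.
def parse_requests(lines)
  current_action = nil
  lines.each_with_object([]) do |line, requests|
    if (m = PROCESSING.match(line))
      current_action = m[:action]
    elsif (m = COMPLETED.match(line))
      requests << { action: current_action, total: m[:total].to_f,
                    render: m[:render].to_f, db: m[:db].to_f }
    end
  end
end

sample = [
  "Processing ExternalController#playlist [GET]",
  "Rendering external/playlist",
  "Completed in 0.02592 (38 reqs/sec) | Rendering: 0.00662 (25%) | DB: 0.00625 (24%) | 200 OK"
]

# Slowest requests first.
parse_requests(sample).sort_by { |r| -r[:total] }.each do |r|
  puts format("%-35s total %.5fs  db %.5fs", r[:action], r[:total], r[:db])
end
```

To run it against a real log, pass File.foreach("log/production.log") instead of the sample array. Actions where the DB time dominates the total are the ones to look at first.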
Although I must say the number of views per day you’re getting isn’t
humongous. I have an old horse (compared to your setup) serving a lot
more than that (it has quite a number of apps running on it too). I
do use Apache + Pound (load balancing) + mongrel cluster. I’ve used
Apache+FCGI quite some time ago and learned you should avoid it. I
first switched over to Lighttpd, which improved things a lot and then
to the current setup. We also have a server running Apache load
balancer+mongrel cluster and nginx+mongrel cluster, they all work
very very well.
Best regards
Peter De Berdt
On Feb 22, 2008, at 2:12 PM, Marc wrote:
Tasks: 180 total, 1 running, 179 sleeping, 0 stopped, 0 zombie
Device: rrqm/s wrqm/s   r/s   w/s rsec/s wsec/s rkB/s  wkB/s avgrq-sz avgqu-sz await svctm %util
sda       0.36  41.93  3.80 18.28 175.58 481.78 87.79 240.89    29.77     0.21  9.62  0.95  2.10
So it does not appear to be an IO problem. Just a busy box. I think
your best course of action here is to get an additional box and move
the fcgi’s over to that box and leave the current box just for the
database. Also make sure your postgresql config is not the default
config, as it is tuned for tiny amounts of memory.
With 2 cpu cores and a load average of 3.x you are not doing too bad.
Basically a load average of 4.0 per cpu core is completely baked. So
if your load average approaches 8.0 then you know your cpus are
completely maxed out.
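As a small worked example of that rule of thumb, using the numbers from the top output earlier in the thread (load average 3.03, 2 cores): the 4.0-per-core ceiling is the figure given here, not a universal constant, and many people use a stricter threshold, so it is left as a parameter. The helper names are made up.

```ruby
# Rule-of-thumb ceiling from this thread: about 4.0 of load average per
# core means the CPUs are maxed out. Adjust to taste.
SATURATION_PER_CORE = 4.0

# Load average divided by the number of cores.
def load_per_core(load_average, cores)
  load_average.to_f / cores
end

# Fraction of the assumed ceiling that is still unused (can go negative).
def cpu_headroom(load_average, cores, ceiling = SATURATION_PER_CORE)
  1.0 - load_per_core(load_average, cores) / ceiling
end

puts format("load per core: %.3f", load_per_core(3.03, 2))
puts format("headroom:      %.0f%%", cpu_headroom(3.03, 2) * 100)
```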
I think you need another box to separate the db from the app server
so you can get out of the big ball of mud setup and into a setup where
you can tune the db and app servers separately.
Cheers-
- Ezra Z.