Rails and CPU usage

I’ve had an app deployed for about six months now on Dreamhost and I’ve
recently been paying attention to the daily CPU statistics my host
provides. Since it’s a shared hosting environment there’s a
semi-arbitrary amount of CPU minutes I am allowed to use before I’m told
to pay for better hosting or find a better host. I’ve been kind of
worried lately since my site has become more popular and, as a result,
my CPU minutes are up. Here’s an excerpt from the log for yesterday:

Process        CPU seconds  user     machine  count  average
dispatch.fcgi  1380.0600    90.150%  5.750%   229    6.026

Based on Google Analytics, that comes from just under 900 unique
visitors and 3,222 pageviews. So when I do the math (1380.06 / 3,222)
it’s roughly 0.43 CPU seconds per pageview.

I’m really not sure when my host will have a problem; I’ve read that
~50-60 CPU minutes is sort of their limit (even though they claim they
have none). At about 23 CPU minutes I am at less than half that, but
I’m still fairly worried.
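
As a quick check of the arithmetic above, here is a small Ruby sketch. The figures are copied from the log excerpt and the Google Analytics numbers in this post; the variable names are just for illustration.

```ruby
# Figures taken from the log excerpt and pageview count quoted above.
cpu_seconds = 1380.06
pageviews   = 3222

seconds_per_pageview = cpu_seconds / pageviews
cpu_minutes          = cpu_seconds / 60.0

puts format("%.2f CPU seconds per pageview", seconds_per_pageview)  # 0.43
puts format("%.1f CPU minutes for the day", cpu_minutes)            # 23.0
```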

I’m just wondering if this is typical of Rails. I have tried to
consider what processes in my application might be CPU heavy but I can’t
think of anything. The most intensive process I can think of would be
thumbnailing uploaded images, which is really just a call directly to
ImageMagick.

My application is something like a forum (minus user registration) and
is currently about 950 lines of code.

I have to admit that, by and large, Rails is near the top of the
CPU-usage chart of web frameworks. A lot goes on behind the scenes,
and from what I see, some of it may have been moved into plugins in
2.0.

Also, your CPU chart will probably not show the CPU time spent in
database queries, only the time spent handling the results once they
return. I have noticed rather huge CPU spikes in the database engine
on certain queries. I use PostgreSQL, and adding indexes helped a lot
there. PostgreSQL is pretty smart about indexes, and will only use one
when it decides there is enough data in the table and you are
selecting on columns it has an index for. In short, when PostgreSQL
takes a lot of CPU, I find it’s my fault. :)

Page or fragment caching will also help a lot. If you can cache a
fragment of a page and be smart about when you re-render it, your CPU
time should drop substantially. For instance, I have (finally)
started adding caching to my code, and currently cache purely “static”
content, like a welcome page, or an about page. Granted, these are
pretty easy pages to render to start with, but I had to start
somewhere, and it was the easiest place.
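
To illustrate the idea of caching a fragment and only re-rendering it when needed, here is a toy sketch in plain Ruby. It is not the Rails caching API; the FragmentCache class, its fetch and expire methods, and the "welcome" key are all made up for the example.

```ruby
# A toy stand-in for fragment caching: NOT the Rails API, just the idea.
class FragmentCache
  def initialize
    @store = {}
  end

  # Return the cached fragment for +key+, rendering it only on a miss.
  def fetch(key)
    @store.fetch(key) { @store[key] = yield }
  end

  # Drop a fragment so the next fetch re-renders it.
  def expire(key)
    @store.delete(key)
  end
end

renders = 0
cache = FragmentCache.new
render_welcome = lambda do
  renders += 1          # count how often the expensive work happens
  "<h1>Welcome</h1>"
end

3.times { cache.fetch("welcome") { render_welcome.call } }
puts renders            # 1: only the first fetch rendered anything

cache.expire("welcome") # e.g. after the page content changes
cache.fetch("welcome") { render_welcome.call }
puts renders            # 2: the expired fragment was rendered again
```

The CPU saving comes from the render block being skipped on every hit; the hard part in a real app is deciding when to expire.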

–Michael

The records I’m pulling are the same for every user, but the way
they’re displayed is conditional. I can only think to code my own
caching system, but that’s kind of outside of my knowledge. The only
solution I’ve found is memcached. The problem is memcached is
recommended only for massive sites.

If it works for you though, use it. I don’t know your data set, but
let’s say it’s under 5 MB (which seems small enough that your ISP
probably wouldn’t care). If putting that all into memcached, using
keys that make sense to your app, saves you from having to regenerate
everything over and over, why not use it?

Conversely, it wouldn’t be too hard to implement Cache.get, Cache.set,
and Cache.delete to work via the filesystem instead. Or the database.
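
A minimal sketch of what a filesystem-backed version could look like. Only the names Cache.get, Cache.set and Cache.delete come from the post above; the cache directory, the path_for helper, the MD5 file naming and the use of Marshal are assumptions for illustration.

```ruby
require "fileutils"
require "tmpdir"
require "digest/md5"

# Hypothetical filesystem-backed cache. The directory and storage format
# are assumptions, not anything prescribed by Rails or memcached.
module Cache
  DIR = File.join(Dir.tmpdir, "app_cache")

  def self.path_for(key)
    # Hash the key so any string is safe to use as a file name.
    File.join(DIR, Digest::MD5.hexdigest(key.to_s))
  end

  def self.set(key, value)
    FileUtils.mkdir_p(DIR)
    path = path_for(key)
    tmp  = "#{path}.#{Process.pid}.tmp"
    File.open(tmp, "wb") { |f| f.write(Marshal.dump(value)) }
    File.rename(tmp, path) # swap in the finished file in one step
    value
  end

  def self.get(key)
    path = path_for(key)
    return nil unless File.exist?(path)
    Marshal.load(File.binread(path))
  end

  def self.delete(key)
    FileUtils.rm_f(path_for(key))
  end
end

Cache.set("front_page", "<ul><li>cached listing</li></ul>")
puts Cache.get("front_page")
Cache.delete("front_page")
p Cache.get("front_page") # nil once the entry is gone
```

There is no expiry here: entries live until something calls Cache.delete, so the app has to delete a key whenever the underlying records change.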

You can save a lot of CPU by avoiding rendering the templates…

Perhaps you could also change the URL structure to cache by
user/conditions or something as well?
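
One way to read "cache by conditions" is to build the cache key from whatever decides how the page is displayed, so every visitor with the same conditions shares one cached copy. A rough sketch follows; the cache_key_for helper and the condition names (sort, show_images) are invented for the example.

```ruby
# Build one cache key per combination of display conditions.
# The page name and condition names used below are hypothetical.
def cache_key_for(page, conditions)
  # Sort by name so the same conditions always give the same key,
  # regardless of the order they were passed in.
  parts = conditions.sort_by { |name, _value| name.to_s }
                    .map { |name, value| "#{name}=#{value}" }
  ([page] + parts).join("/")
end

puts cache_key_for("threads", { sort: "newest", show_images: true })
# prints "threads/show_images=true/sort=newest"
```

A key like this could double as the URL path for page caching, or as the key for a fragment or memcached entry.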

I will have to do more research on memcached. The presentation I looked
at for it said it was only used for very large websites with thousands
of database rows, such as Twitter. I figured that on my shared hosting
environment, employing it might do more harm than good, but I honestly
don’t know for sure either way.

Well, it’s good that it might not be my code slowing everything down,
but at the same time bad that I can’t directly fix it.

I guess if my site were commercial I could just “throw more hardware at
it”, but when it’s not for profit that idea doesn’t work. At this
point I’m making enough from ad revenue to maybe afford a VPS like
Slicehost.

I’ve looked at caching but can’t really find anything that suits my
needs. The records I’m pulling are the same for every user, but the way
they’re displayed is conditional. I can only think to code my own
caching system, but that’s kind of outside of my knowledge. The only
solution I’ve found is memcached. The problem is memcached is
recommended only for massive sites.