Refactoring CPU-intensive actions

Gammons · January 11, 2008, 6:59pm

Hey Guys,

I wanted to get some general opinions about a particularly
CPU-intensive action I have that generates XML to be fed into a
Flash-based graph.

The data collected for the graph is per-user and is already cached as
much as possible. We use a combination of cache_fu, a fragment cache
with a 5-minute TTL for the action itself, and model method caching
(partially cache_fu, partially homegrown).
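For readers unfamiliar with the idea, a homegrown method cache with a TTL might look roughly like the sketch below. This is only an illustration of the technique: `TtlCache` and its methods are hypothetical names, not part of cache_fu or Rails, and the real setup described here is not shown in the thread.

```ruby
# Minimal in-memory cache with a per-entry time-to-live (TTL).
# TtlCache is a hypothetical name, not part of cache_fu or Rails.
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  # ttl_seconds: how long an entry stays fresh.
  # clock: callable returning the current time in seconds; injectable
  # so that expiry can be exercised without sleeping.
  def initialize(ttl_seconds, clock = -> { Time.now.to_f })
    @ttl = ttl_seconds
    @clock = clock
    @store = {}
  end

  # Returns the cached value for key. On a miss, or once the entry has
  # expired, runs the block, stores its result, and returns it.
  def fetch(key)
    now = @clock.call
    entry = @store[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @store[key] = Entry.new(value, now + @ttl)
    value
  end
end
```

With a 300-second TTL, a call such as `cache.fetch("graph_xml/#{user_id}") { ... }` would only run the expensive block once per five minutes per user. It does nothing for the first request after expiry, which is exactly the slow case described next.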

The problem is that when there is a cache miss and the user has a
large amount of data to process, Ruby just chugs for a long time:
possibly 30 seconds or even longer, until a Mongrel timeout finally
occurs. So this action is obviously ripe for a refactor.

I decided to replicate the same action as a Java class, and achieved a
speedup of roughly 6-7x (5-6 seconds to generate the XML needed, as
opposed to about 35 seconds for Ruby).

Another idea I had was to create a MySQL stored procedure to gather
all the data needed, which can then be synthesized into XML on the
Ruby side. This may be preferable, since I won’t have to introduce
Java into our production mix.
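To make the stored-procedure idea concrete, here is a rough sketch of what the Ruby side could shrink to if the database returned one already-aggregated row per data point. Everything specific in it is an assumption: the `GraphXml` module, the `:label` and `:total` keys, and the `<graph>`/`<point>` element names are made up, since neither the schema nor the format the Flash graph expects is given here.

```ruby
# Turns pre-aggregated rows into a small XML document.
# GraphXml, the :label/:total keys and the element names are
# hypothetical; substitute whatever the graph actually expects.
module GraphXml
  # Characters that must be escaped inside XML attribute values.
  ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

  module_function

  def escape(value)
    value.to_s.gsub(/[&<>"]/, ESCAPES)
  end

  # rows: array of hashes, one per data point, as they might come back
  # from a stored procedure that has already done the heavy lifting.
  def build(rows)
    points = rows.map do |row|
      %(  <point label="#{escape(row[:label])}" value="#{escape(row[:total])}"/>)
    end
    (["<graph>"] + points + ["</graph>"]).join("\n")
  end
end
```

The point of the split is that Ruby only loops once over a small result set and formats strings, while the grouping and summing happen in MySQL.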

I welcome any thoughts/ideas!

Gammons · January 11, 2008, 7:06pm

On 11 Jan 2008, at 17:58, Gammons wrote:

> (partially cache_fu, partially homegrown).
> Another idea I had was to create a mysql stored procedure to gather
> all the data needed, which can then be synthesized into xml on the
> ruby side (this may be preferable since I won’t have to introduce java
> into our production mix.

Other approaches you might try include RubyInline or a Ruby extension
(especially if the bottleneck is actually quite a small amount of code).
A somewhat orthogonal approach is to use something like backgroundrb
to decouple the processing from the HTTP requests, so that you don’t
hog Mongrels for too long.

Fred.
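The decoupling idea can be sketched in plain Ruby as a submit-then-poll pattern: the request enqueues the work and returns a job id straight away, and a later request asks whether the result is ready. This is not backgroundrb's API. `JobRunner` is a hypothetical name, and a real deployment would run the worker in a separate process rather than a thread inside the Mongrel.

```ruby
require "thread"
require "securerandom"

# In-process job runner illustrating submit-then-poll.
# JobRunner is a hypothetical name; this is not backgroundrb's API.
class JobRunner
  def initialize
    @queue = Queue.new
    @results = {}
    @lock = Mutex.new
    @worker = Thread.new { work_loop }
  end

  # Enqueue a block of work and return its id without waiting for it.
  def submit(&work)
    id = SecureRandom.hex(8)
    @lock.synchronize { @results[id] = { status: :pending } }
    @queue << [id, work]
    id
  end

  # Returns { status: :pending }, { status: :done, value: ... },
  # { status: :failed, error: "..." }, or nil for an unknown id.
  def status(id)
    @lock.synchronize { @results[id] }
  end

  # Let already-queued jobs finish, then stop the worker thread.
  def shutdown
    @queue << :stop
    @worker.join
  end

  private

  # Pops jobs one at a time and records either the value or the error.
  def work_loop
    while (job = @queue.pop) != :stop
      id, work = job
      result = begin
        { status: :done, value: work.call }
      rescue StandardError => e
        { status: :failed, error: e.message }
      end
      @lock.synchronize { @results[id] = result }
    end
  end
end
```

A controller action would call `submit` and render immediately, and the client would poll a second action that calls `status`. The trade-off is that the user waits for the result across several short requests instead of one long one.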

Gammons · January 11, 2008, 7:13pm

Thanks, Fred. I had actually never considered using RubyInline. I
will check it out!

Unfortunately I can’t use BackgroundRB, as the data needs to be
generated in “real time”. As an aside, we have experimented with
backgroundrb and concluded it’s a pretty poor offering. We found that
using Amazon SQS + ActiveMessaging + a separate server is a much better
approach to backgrounding intensive tasks that don’t need to be
completed in real time. But that’s another topic!

Thanks again,
Grant

On Jan 11, 1:06 pm, Frederick C. wrote: