Rails, GC and memory-eating mongrels - how to pinpoint the cause?

We’re running into memory problems with our Rails app.
The short version of the story is that we have a controller that
manipulates a fairly hefty dataset (pulled out of our database).
Whichever fastcgi/mongrel handles that request loads this dataset
into memory.
The problem is that once the controller has finished running there does
not appear to be any decrease in the memory footprint of the mongrels.
After the user has been working at this for a short while, most of the
mongrels will have serviced such a request and so each one of these is
carting around ~100 megs of RAM (as given by the ‘resident’ column in
top) instead of the usual ~30 or so, which eventually causes our server
to start swapping like mad.

It doesn’t look like a straightforward leak: eventually the memory usage
of each mongrel tops out (it usually takes 2-3 or more requests to
reach this point).

I’ve narrowed down the problem somewhat, and as a test I have an action
which looks like this (and which has the same behaviour as the real
actions in the application):

def test_action
  c = ActiveRecord::Base.connection
  c.select_all(<<_SQL
SELECT question_group_entries.* FROM question_group_entries
RIGHT JOIN question_group_user_cache_entries
ON question_group_user_cache_entries.outgoing_message_id =
  question_group_entries.outgoing_message_id
_SQL
  )
end

The details of the query are not important; the significant thing (I
think) is that the result set is relatively large (~100,000 rows, each
consisting of 3 numbers).

I’ve been poking around with Stefan Kaes’ railsbench, and the GC-related
stuff in particular (I have recompiled my Ruby interpreter with the patch
he supplies), but I’m not sure what to make of the numbers I’m getting,
or how best to tweak the GC parameters the patch offers.
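
A cruder thing I can try is to diff a per-class object count before and
after the query from script/console. A rough sketch; the method name and
the big_query variable are placeholders:

def object_histogram
  GC.start                          # collect first so only live objects are counted
  counts = Hash.new(0)
  ObjectSpace.each_object { |obj| counts[obj.class] += 1 }
  counts
end

big_query = "SELECT question_group_entries.* FROM ..."   # the SQL from test_action above
before = object_histogram
rows = ActiveRecord::Base.connection.select_all(big_query)
rows = nil                          # drop our own reference before recounting
after = object_histogram

# Print the 20 classes that grew the most across the request.
(after.keys | before.keys).sort_by { |k| before[k] - after[k] }.first(20).each do |k|
  puts "#{k}: #{before[k]} -> #{after[k]}"
end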

In case it matters, we’re using MySQL 4.1.12 under Linux on our servers
(Apache 1.3 + FastCGI on the production machine, Apache 2 + 8 mongrels on
the test production machine), and I’m running MySQL 4.1.21 under OS X on
my development machine.

Any suggestions as to what we should be looking at would be greatly
appreciated.
Thanks,

Fred

On Wed, 27 Sep 2006 16:45:03 +0200
Frederick C. [email protected] wrote:

We’re running into memory problems with our Rails app.
The short version of the story is that we have a controller that
manipulates a fairly hefty dataset (pulled out of our database).
Whichever fastcgi/mongrel handles that request loads this dataset
into memory.

Quick set of suggestions:

  1. Make sure that you aren’t storing the values in a @@ class variable
    or similar long-lived area.
  2. Don’t store these objects in the session (or any objects other than
    base types). The session is not faster than the database you’re already
    using and it creates copies.
  3. Run a multiuser test against your application and see if it’s making
    many copies of the same data. If so then either use or write a cache
    library for it that uses weakref so they can be collected, but share the
    data between all the clients. This is really hard to get right.
  4. Mongrel has a simple -B option which will log memory usage to
    log/mongrel_debug/objects.log and might help find out what’s going on.

Hope that helps.


Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu

http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 – Come get help.
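
For reference, a minimal sketch of the weakref-style cache mentioned in
suggestion 3, assuming a single mongrel process and ignoring thread
safety (the class name and interface are invented, not an existing
library):

require 'weakref'

# Holds values through WeakRef so the GC remains free to reclaim the big
# result sets when memory gets tight.
class WeakCache
  def initialize
    @refs = {}
  end

  def fetch(key)
    ref = @refs[key]
    if ref && ref.weakref_alive?
      begin
        return ref.__getobj__
      rescue WeakRef::RefError
        # collected between the aliveness check and the access; rebuild below
      end
    end
    value = yield                       # rebuild the data, e.g. re-run the query
    @refs[key] = WeakRef.new(value)
    value
  end
end

# Usage (names are illustrative):
#   DATASET_CACHE = WeakCache.new
#   rows = DATASET_CACHE.fetch(:question_group_entries) { load_big_dataset }

As the suggestion says, this is hard to get right, and it only shares
data within one process; separate mongrels do not share memory at all.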

Frederick C. wrote:

The details of the query are not important; the significant thing (I
think) is that the result set is relatively large (~100,000 rows, each
consisting of 3 numbers).

100,000 rows of 3 numbers each means roughly 300,000 value objects;
at a minimum of 20 bytes of Ruby overhead apiece that is 300,000 * 20 =
6,000,000 bytes. Possibly even more, since objects may have instance
variables. Plus the memory allocator has its own overhead too. Also,
each row is an object in itself.
I think you’ve got it pretty well pinpointed.

While you could try forcing the GC in an after_filter, or even better in
a before_filter, the best option would be to create fewer objects at the
Ruby level.

Zsombor

Company - http://primalgrasp.com
Thoughts - http://deezsombor.blogspot.com
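
A sketch of the filter approach Zsombor describes (the controller name
here is made up; test_action is the test action from earlier in the
thread):

class BigDatasetController < ApplicationController
  # Run a collection once the heavy action has rendered and its locals
  # have gone out of scope.
  after_filter :collect_garbage, :only => :test_action

  def test_action
    # ... the big select_all from earlier ...
  end

  private

  def collect_garbage
    GC.start
  end
end

Worth noting that GC.start only reclaims Ruby objects; whether the
interpreter hands that memory back to the operating system is up to the
allocator, so the resident size in top may not visibly drop even when
the collection works.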

Zed A. Shaw wrote:

On Wed, 27 Sep 2006 16:45:03 +0200
Frederick C. [email protected] wrote:

We’re running into memory problems with our Rails app.
The short version of the story is that we have a controller that
manipulates a fairly hefty dataset (pulled out of our database).
Whichever fastcgi/mongrel handles that request loads this dataset
into memory.

Quick set of suggestions:

  1. Make sure that you aren’t storing the values in a @@ class variable
    or similar long-lived area.
  2. Don’t store these objects in the session (or any objects other than
    base types). The session is not faster than the database you’re already
    using and it creates copies.

I’ve already checked my code for stuff like that, and my test case with
an action consisting solely of one query still exhibits the problem.

  3. Run a multiuser test against your application and see if it’s making
    many copies of the same data. If so then either use or write a cache
    library for it that uses weakref so they can be collected, but share the
    data between all the clients. This is really hard to get right.

There is typically only one person using this particular part of the
app, so (unless I’m not getting what you’re saying) I don’t think this
will help.

  4. Mongrel has a simple -B option which will log memory usage to
    log/mongrel_debug/objects.log and might help find out what’s going on.

That sounds promising.

Thanks,

Fred

Not sure if this is relevant, but see this:

http://blog.segment7.net/articles/2006/09/13/controlling-rails-process-size

Vish

Dee Z. wrote:

Frederick C. wrote:

The details of the query are not important; the significant thing (I
think) is that the result set is relatively large (~100,000 rows, each
consisting of 3 numbers).

100,000 rows of 3 numbers each means roughly 300,000 value objects;
at a minimum of 20 bytes of Ruby overhead apiece that is 300,000 * 20 =
6,000,000 bytes. Possibly even more, since objects may have instance
variables. Plus the memory allocator has its own overhead too. Also,
each row is an object in itself.
I think you’ve got it pretty well pinpointed.

While you could try forcing the GC in an after_filter, or even better in
a before_filter, the best option would be to create fewer objects at the
Ruby level.

I tried forcing a GC, to no effect (and using the GC stuff in railsbench
it is clear that GCs are happening).

Fred

Vishnu G. wrote:

Not sure if this is relevant, but see this:

http://blog.segment7.net/articles/2006/09/13/controlling-rails-process-size

Vish

Hmm, I hadn’t thought of that. While it’s a bit icky (and not great if
you’re the user who happens to push the Rails process over its limit),
it would solve the problem of the mongrels running the server into the
ground.

I have however realised that I can solve this another way: I think I can
write things so that I never need to load this large object graph in at
all, only bite-sized portions of it.

Fred
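
The bite-sized loading Fred mentions could be done with the
:limit/:offset finder options, along these lines (the model name is
guessed from the table name above, and process is a placeholder for the
real per-row work):

batch_size = 1000
offset = 0
loop do
  # A stable order keeps the offsets consistent between batches
  # (this assumes the usual id primary key).
  entries = QuestionGroupEntry.find(:all, :order => 'id',
                                    :limit => batch_size, :offset => offset)
  break if entries.empty?
  entries.each { |entry| process(entry) }
  offset += batch_size
end

Only one batch of ActiveRecord objects is alive at a time, so each
mongrel’s footprint should stay much closer to its baseline.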

If the objects you’re referring to are ActiveRecord objects, you could
try this guy’s paginating find plugin:

ed

We’ve also done work on large datasets. At first I was manipulating
them with ActiveRecord, but the processing time and memory footprint
were just too much for poor Ruby. Since it’s only three values coming
back per row, try making a direct database query and manipulating the
results yourself. Though I haven’t done a lot of deep work in the source
code, I do know ActiveRecord does a ton of work to make database objects
easy for you. If you can avoid that work you’ll get a nice performance
increase.
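
If a direct database query means going straight to the driver and
skipping ActiveRecord entirely, a sketch with the plain mysql gem might
look like the following (connection details and column names are
placeholders):

require 'mysql'

db = Mysql.real_connect('localhost', 'user', 'password', 'app_production')
# Select only the three columns you actually need.
result = db.query('SELECT outgoing_message_id, col_a, col_b FROM question_group_entries')
result.each do |row|
  # row is a plain Array of strings: one small object per row instead of
  # an ActiveRecord instance plus a hash of column names.
end
result.free
db.close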

We switched to using Java for the processing we’re doing, though I’ll
most likely move to Python here shortly so it’s easier to modify the
scripts. As a rough benchmark, we were processing 100,000 entries out
of a text file into a single database table. Using Ruby and
ActiveRecord the process took over 48 hours, while in Java using iBatis
it took about three hours. There’s a lot of additional processing
going on with each record, such as checking for updates versus creates
and maintaining PKIDs, so ActiveRecord gets a good workout.

We were already manipulating this data set without using ActiveRecord
(and it was definitely a big help).
My immediate problem has been solved, but I’m still curious as to what
tools I should have in my toolchain when faced with this sort of problem.

Fred
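
One more lightweight tool for that toolchain, for what it’s worth:
watching the resident size of the current process from inside Ruby, to
confirm what top is showing per request. A sketch (the ps flags are the
Linux/OS X ones, and the method name is made up):

# Resident set size of this process in kilobytes, via ps.
def resident_size_kb
  `ps -o rss= -p #{Process.pid}`.to_i
end

before = resident_size_kb
# ... run the suspect action or query here ...
GC.start
after = resident_size_kb
puts "RSS grew by #{after - before} KB across the request"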