Ruby memory usage


#1

Hi Folks,

I’m trying to figure out a memory usage issue with some long running
ruby processes and was wondering if anyone can help me out.

The background is that I have a set of long running ruby processes which
are communicating with a larger system using an activeMQ server via
STOMP. These processes are running for weeks at a time, most of the time
just sitting there waiting for an incoming message.

My team has noticed that these processes memory usage gradually grows
over time. As a concrete example one of my processes starts life using
around ~20MB of virtual memory, and 20 hours later is sitting at 382MB
of virtual memory, 115MB resident.

My first step in trying to figure this out was to use a slightly tweaked
version of the memory profiler described here:
http://scottstuff.net/blog/2006/08/17/memory-leak-profiling-with-rails.
Essentially I’m just taking a snapshot of ObjectSpace every 10 seconds
and dumping a top 20 count of how many instances there are of a class.
Here are two snapshots, one take a few minutes after the process starts,
and one ~20 hours later:

----------Wed May 06 14:03:33 -0700 2009----------
Top 20
23724: String
1121: Hash
776: Class
738: Regexp
578: Array
474: StatsBucket(id: integer, organization_id: integer,
se_account_id: integer, date: date)
474: TZInfo::TimezoneTransitionInfo
374: Proc
236: Module
154: Bignum
152: UnboundMethod
134: Range
62: Rational
57: Float
34: Gem::Version
28: ActiveRecord::ConnectionAdapters::MysqlColumn
27: Time
22: Gem::Version::Requirement
17: Mutex
16: Gem::Specification

----------Thu May 07 10:45:48 -0700 2009----------
Top 20
23848: String
834: Class
790: Bignum
783: Array
738: Regexp
506: Hash
474: TZInfo::TimezoneTransitionInfo
376: Proc
236: Module
159: StatsBucket(id: integer, organization_id: integer,
se_account_id: integer, date: date)
152: UnboundMethod
134: Range
129: Rational
107: Float
104: MatchData
54: Thread
46: Timeout::Error
45: Time
34: Gem::Version
32: ActiveRecord::ConnectionAdapters::MysqlColumn

As you can see there’s not that much of a change in the ObjectSpace, and
yet the memory usage of this same process has blown up from 20MB to over
300MB of virtual memory.

Can anyone give me any clues on what’s going on here? Or any pointers as
to what else I can in terms of measurements and instrumentation to get a
better insight as to what’s going here?

Any help greatly appreciated!

Cheers,
Pete


#2

On Thu, May 7, 2009 at 11:01 AM, Pete H. removed_email_address@domain.invalid
wrote:

My team has noticed that these processes memory usage gradually grows
and one ~20 hours later:
  474: TZInfo::TimezoneTransitionInfo
   22: Gem::Version::Requirement
  738: Regexp
  104: MatchData
300MB of virtual memory.

Can anyone give me any clues on what’s going on here? Or any pointers as
to what else I can in terms of measurements and instrumentation to get a
better insight as to what’s going here?

The first thing that jumps to mind is that the 54 instance of Thread
in the later listing may be a sign of the problem; unless you really
should have that many Threads hanging around, you may be doing
something that is preventing used Thread objects from being garbage
collected. If you are, for instance, launching a new thread when you
get a message and they are hanging around after they aren’t needed,
that could explain the expanding memory usage.


#3

Christopher D. wrote:

On Thu, May 7, 2009 at 11:01 AM, Pete H. removed_email_address@domain.invalid
wrote:

My team has noticed that these processes memory usage gradually grows
and one ~20 hours later:
474: TZInfo::TimezoneTransitionInfo
22: Gem::Version::Requirement
738: Regexp
104: MatchData
300MB of virtual memory.

Can anyone give me any clues on what’s going on here? Or any pointers as
to what else I can in terms of measurements and instrumentation to get a
better insight as to what’s going here?

The first thing that jumps to mind is that the 54 instance of Thread
in the later listing may be a sign of the problem; unless you really
should have that many Threads hanging around, you may be doing
something that is preventing used Thread objects from being garbage
collected. If you are, for instance, launching a new thread when you
get a message and they are hanging around after they aren’t needed,
that could explain the expanding memory usage.

Hmm, you’re right that is strange. I can’t think of anywhere where I am
spawning threads after initial start up, so I’ll look into that more and
see where that leads. Thanks Christopher!


#4

Pete H. wrote:

Christopher D. wrote:

On Thu, May 7, 2009 at 11:01 AM, Pete H. removed_email_address@domain.invalid
wrote:

My team has noticed that these processes memory usage gradually grows

For what it’s worth, I have a pretty complex web app with multiple
dimensional
arrays referencing nested classes and it runs really slow when we have
less than
a gig of memory, which never really made sense to us for the level of
stuff it
was doing. I may just need to spend a lot more time cleaning up my
code, but
seems to me Ruby can eat a bit more memory than one might expect it
should.

xc


#5

On Thu, May 7, 2009 at 12:01 PM, Pete H.
removed_email_address@domain.invalidwrote:

My team has noticed that these processes memory usage gradually grows
over time. As a concrete example one of my processes starts life using
around ~20MB of virtual memory, and 20 hours later is sitting at 382MB
of virtual memory, 115MB resident.

One thing to keep in mind is that MRI’s garbage collector never returns
memory to the system, and doesn’t do a very good job in general. MRI
has a
tendency towards unbounded heap growth over time, especially in
long-running
processes.

You might try JRuby if you don’t have too many dependencies on C
extensions. It uses the advanced garbage collection available in the
JVM.


#6

2009/5/8 Tony A. removed_email_address@domain.invalid:

tendency towards unbounded heap growth over time, especially in long-running
processes.

I believe this has changed with 1.9.1 - at least memory usage seems to
not grow as much as with 1.8.*.

Btw, you can easily construct a memory leak, which cannot necessarily
be detected by counting objects:

ruby19 -e ‘s="."*1024;t="";loop do t << s end’

I.e., you can grow any String which is not frozen to arbitrary lengths.

Btw, Pete, you might get more insight into your application if you
dump deltas of object counts. IIRC we had a sample here in the group
but I cannot find it at the moment.

Kind regards

robert


#7

From: “Pete H.” removed_email_address@domain.invalid

Hmm, you’re right that is strange. I can’t think of anywhere where I am
spawning threads after initial start up, so I’ll look into that more and
see where that leads. Thanks Christopher!

Also, what’s your ruby -v ?

Some versions of 1.8.6 have leaked memory, for instance.

Regards,

Bill


#8

Pete H. wrote:

Hi Folks,

I’m trying to figure out a memory usage issue with some long running
ruby processes and was wondering if anyone can help me out.

If it’s poor GC you could try out the MBARI patches. I had a rails app
that grew by 8MB /request every request until I switched to MBARI–now
it works like a charm. Good luck!
-=r


#9

Robert K. wrote:

Btw, you can easily construct a memory leak, which cannot necessarily
be detected by counting objects:

ruby19 -e ‘s="."*1024;t="";loop do t << s end’

I.e., you can grow any String which is not frozen to arbitrary lengths.

That not really a leak, though, since t is referenced…

Not that this situation isn’t a potential problem, but we should make
sure everyone understands that the blame for it rests squarely on the
programmer and not on ruby’s GC.


#10

On 08.05.2009 18:24, Joel VanderWerf wrote:

Robert K. wrote:

Btw, you can easily construct a memory leak, which cannot necessarily
be detected by counting objects:

ruby19 -e ‘s="."*1024;t="";loop do t << s end’

I.e., you can grow any String which is not frozen to arbitrary lengths.

That not really a leak, though, since t is referenced…

Well, AFAIK the definition of “memory leak” in GC languages is that of
objects kept longer than needed. So if someone creates a program which
simply appends text to a String over and over again where creating new
short strings would be sufficient he has created a memory leak IMHO.

My main point though was to demonstrate that memory usage and object
count do not necessarily correlate.

Not that this situation isn’t a potential problem, but we should make
sure everyone understands that the blame for it rests squarely on the
programmer and not on ruby’s GC.

Yep!

Kind regards

robert


#11

ruby19 -e ‘s="."*1024;t="";loop do t << s end’

I.e., you can grow any String which is not frozen to arbitrary lengths.

That not really a leak, though, since t is referenced…

Not that this situation isn’t a potential problem, but we should make
sure everyone understands that the blame for it rests squarely on the
programmer and not on ruby’s GC.

In this particular case the problem definitely lies with the programmer.
However, in some other instances ruby’s GC really does stink (but
hopefully the MBARI patches overcome its deficiencies, and also Ruby 1.9
should have fewer problems since its threads use distinct thread memory
spaces so can’t share ghost references).
Cheers!
-=r


#12

Pete H. wrote:

Thanks so much for all the input folks, it’s really really appreciated.

ruby -v gives:
ruby 1.8.6 (2008-06-20 patchlevel 230) [i386-linux]

I did some additional tweaks to the memory profiler, and it now dumps a
yaml document with object space details every 10 seconds. I did some
analysis on discovered that:

  • Virtual memory grows in a straight line serially over time
  • Resident memory grows up to a point, then start fluctuating. I’m
    presuming that’s the kernel starting to reclaim resident memory from the
    process as it starts to run out of physical memory.
  • Object counts stay almost entirely flat over time.

At this point I wonder whether the issue is with the MRI GC, as Tony
mentioned. I’m a little leary of upgrading to 1.9 just in order to fix
this, but I might give it a try just to see what the memory usage over
time looks like compared to 1.8.6.

I charted out memory usage over time and object space usage over time to
illustrate what I’m seeing:

http://dl.getdropbox.com/u/796485/mercury_mem_usage.jpg
http://dl.getdropbox.com/u/796485/mercury_object_counts.jpg


#13

Hello,

I charted out memory usage over time and object space usage over time to
illustrate what I’m seeing:

http://dl.getdropbox.com/u/796485/mercury_mem_usage.jpg
http://dl.getdropbox.com/u/796485/mercury_object_counts.jpg

Which type of object is the red curve (between 500 and 1000) that
makes one jump up when memory start raising and begins to behave
erraticaly approximately at the same time resident memory does ?

Cheers,

JJ


#14

On Fri, May 8, 2009 at 2:49 PM, Pete H. removed_email_address@domain.invalid
wrote:

Thanks so much for all the input folks, it’s really really appreciated.

ruby -v gives:
ruby 1.8.6 (2008-06-20 patchlevel 230) [i386-linux]

I did some additional tweaks to the memory profiler, and it now dumps a
yaml document with object space details every 10 seconds.

I think memory profiling using ObjectSpace is fundamentally flawed (a
few
years ago I wrote an ObjectSpace-powered memory profiler and had many
issues
with it)

You might check out bleak_house instead:

http://blog.evanweaver.com/files/doc/fauna/bleak_house/files/README.html

At this point I wonder whether the issue is with the MRI GC, as Tony
mentioned. I’m a little leary of upgrading to 1.9 just in order to fix
this, but I might give it a try just to see what the memory usage over
time looks like compared to 1.8.6.

With JRuby you can have 1.8 compatibility plus your selection of all the
pluggable garbage collectors available in the JVM.


#15

Fleck Jean-Julien wrote:

Hello,

I charted out memory usage over time and object space usage over time to
illustrate what I’m seeing:

http://dl.getdropbox.com/u/796485/mercury_mem_usage.jpg
http://dl.getdropbox.com/u/796485/mercury_object_counts.jpg

Which type of object is the red curve (between 500 and 1000) that
makes one jump up when memory start raising and begins to behave
erraticaly approximately at the same time resident memory does ?

Cheers,

JJ

instances of Array. Not sure why it starts behaving like that at the
same time residential memory does.


#16

Thanks so much for all the input folks, it’s really really appreciated.

ruby -v gives:
ruby 1.8.6 (2008-06-20 patchlevel 230) [i386-linux]

I did some additional tweaks to the memory profiler, and it now dumps a
yaml document with object space details every 10 seconds. I did some
analysis on discovered that:

  • Virtual memory grows in a straight line serially over time
  • Resident memory grows up to a point, then start fluctuating. I’m
    presuming that’s the kernel starting to reclaim resident memory from the
    process as it starts to run out of physical memory.
  • Object counts stay almost entirely flat over time.

At this point I wonder whether the issue is with the MRI GC, as Tony
mentioned. I’m a little leary of upgrading to 1.9 just in order to fix
this, but I might give it a try just to see what the memory usage over
time looks like compared to 1.8.6.


#17

From: “Pete H.” removed_email_address@domain.invalid

ruby -v gives:
ruby 1.8.6 (2008-06-20 patchlevel 230) [i386-linux]

patchlevel 230 was one of the bad ones. It leaks like a sieve.

See [ruby-core] thread:

[Ruby 1.8 - Bug #216] Memory leaks in 1.8.6p230 and p238
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/17571

It affected a whole range of patchlevels:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/17596

Also, a leak of similar severity was introduced later on, starting
in p296 lasting at least through p355:

[ruby-core:22582] Re: [Bug #1223] Memory leak reintroduced in 1.8.6
branch?
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/22852

I’d recommend trying the recent official 1.8.6 release, p368
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/332578

I have not personally tried p368 on my long-running servers, as
I had already found a stable version, p279. So I have not had a
need to upgrade the version. I can say that p279 is very stable
in its memory profile. I’m guessing the official p368 release
probably is too.

Hope this helps,

Bill


#18

Tony A. wrote:

On Fri, May 8, 2009 at 2:49 PM, Pete H. removed_email_address@domain.invalid
wrote:

Thanks so much for all the input folks, it’s really really appreciated.

ruby -v gives:
ruby 1.8.6 (2008-06-20 patchlevel 230) [i386-linux]

I did some additional tweaks to the memory profiler, and it now dumps a
yaml document with object space details every 10 seconds.

I think memory profiling using ObjectSpace is fundamentally flawed (a
few
years ago I wrote an ObjectSpace-powered memory profiler and had many
issues
with it)

You might check out bleak_house instead:

http://blog.evanweaver.com/files/doc/fauna/bleak_house/files/README.html

I’m thinking that is probably gonna have to be the next step. I had a
bad experience trying to set it up in the past and now am a bit gun shy,
but I’m running out of options here…

At this point I wonder whether the issue is with the MRI GC, as Tony
mentioned. I’m a little leary of upgrading to 1.9 just in order to fix
this, but I might give it a try just to see what the memory usage over
time looks like compared to 1.8.6.

With JRuby you can have 1.8 compatibility plus your selection of all the
pluggable garbage collectors available in the JVM.

Thanks, I’ll give JRuby a run on a VM and see how it performs. I’m
guessing that bleak_house and JRuby don’t play together?


#19

Hi,

At Fri, 8 May 2009 12:57:52 +0900,
Tony A. wrote in [ruby-talk:336091]:

One thing to keep in mind is that MRI’s garbage collector never returns
memory to the system, and doesn’t do a very good job in general. MRI has a
tendency towards unbounded heap growth over time, especially in long-running
processes.

You know only pretty old versions.


#20

With JRuby you can have 1.8 compatibility plus your selection of all the
pluggable garbage collectors available in the JVM.

Thanks, I’ll give JRuby a run on a VM and see how it performs. I’m
guessing that bleak_house and JRuby don’t play together?

bleak house is a patched mri to provide quick access to its internals.
As an alternative to 1.9 you could try mbari patches, which are patched
versions of 1.8.7 and 1.8.6
GL!
-=r