Debugging memory use and GC

What is a good way to find out which objects are not being GC’d? I am
seeing a strange pattern I can’t figure out. The app handles large
files and uses up to 150 MB or so of memory; when I call GC.start it
drops back to around 8 MB. But after a few cycles memory stops being
reclaimed.

Chris

snacktime wrote:

What is a good way to find out what objects are not being GC’d ? I am
seeing a strange pattern I can’t figure out. The app is handling
large files and will use up to 150mb or so of memory and then when I
call GC.start it goes back down to around 8mb. But after a few cycles
memory stops being reclaimed.

Debug, set a breakpoint after GC when you expect the anomaly to occur,
inspect ObjectSpace?

David V.

On 10/25/06, snacktime wrote:

What is a good way to find out what objects are not being GC’d ? I am
seeing a strange pattern I can’t figure out. The app is handling
large files and will use up to 150mb or so of memory and then when I
call GC.start it goes back down to around 8mb. But after a few cycles
memory stops being reclaimed.

Of course, what you really want to know is not just what’s not getting
GCed but WHY.

This can be a difficult problem. You really want to find the
reference paths from root objects.

Some GC languages like Smalltalk have methods in object to get a list
of everything which references it. I haven’t seen such a facility in
Ruby. The catch is that calling such a method itself generates
additional references to the objects doing the referencing, and so on.
This kind of Heisenberg effect makes building a tool to find
reference paths difficult.
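
For what it is worth, later Ruby releases ship an objspace extension whose ObjectSpace.reachable_objects_from lists what a single object references directly. Below is a minimal sketch of a brute-force referrer search built on it. The name direct_referrers is a hypothetical helper, not part of any library, and the scan suffers from exactly the effect described above: the temporary arrays it allocates can themselves show up as referrers.

```ruby
require 'objspace'

# Hypothetical helper: return every heap object that directly references target.
def direct_referrers(target)
  found = []
  ObjectSpace.each_object(Object) do |candidate|
    # Skip the target itself and our own result array.
    next if candidate.equal?(target) || candidate.equal?(found)
    refs = ObjectSpace.reachable_objects_from(candidate)
    found << candidate if refs && refs.any? { |r| r.equal?(target) }
  end
  found
end

payload   = "big buffer"
holder    = [payload]
referrers = direct_referrers(payload)
puts referrers.any? { |r| r.equal?(holder) }
```

Referrers that are not heap objects (local variables, C globals) will not appear in the result, so an empty list does not prove the object is unreachable.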


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On 10/26/06, Darshan P. wrote:

If you suspect some objects, add a finalizer using
ObjectSpace#add_finalizer and put some trace in it.

Of course adding a finalizer won’t be much help in debugging why an
object is not being GCed, since the finalizer will never be invoked.
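
A small sketch of that point, using ObjectSpace.define_finalizer (the current name for the finalizer hook). TrackedBuffer and FINALIZER_LOG are hypothetical names chosen for illustration. The finalizer proc is built in a class method so that it does not capture the object it is attached to, which would itself keep the object alive.

```ruby
FINALIZER_LOG = []

class TrackedBuffer
  def initialize(label)
    @label = label
    ObjectSpace.define_finalizer(self, self.class.trace_for(label))
  end

  # Built outside the instance so the proc does not close over self.
  def self.trace_for(label)
    proc { |object_id| FINALIZER_LOG << "finalized #{label} (id #{object_id})" }
  end
end

kept = TrackedBuffer.new("kept")   # still referenced by a local variable
TrackedBuffer.new("dropped")       # no reference kept; may or may not be collected
GC.start

puts FINALIZER_LOG
```

Whether "dropped" appears in the log after GC.start is not guaranteed, but "kept" cannot appear while the reference is held, so the trace tells you nothing about why a leaked object is being retained.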


Rick DeNatale


On Fri, 27 Oct 2006, Rick DeNatale wrote:

This can be a difficult problem. You really want to find the
reference paths from root objects.

Some GC languages like Smalltalk have methods in object to get a list
of everything which references it. I haven’t seen such a facility in
Ruby. Then the problem with this is that calling this method
generates additional references to the objects referencing the object
etc. This kind of heisenberg effect makes building a tool to find
reference paths difficult.

it would be expensive, but i wonder if dumping the objects in objectspace
might be useful - since Marshal.dump already follows all references it seems
like a custom _dump method on object, through which objects could add
themselves to a tree, might do the trick. in other words, if you dumped an
object with a global tree in context then all objects being dumped as a
result would add themselves to this tree. after the dump, you simply keep a
copy of the tree…

just a thought…

-a

snacktime wrote:

What is a good way to find out what objects are not being GC’d ? I am
seeing a strange pattern I can’t figure out. The app is handling
large files and will use up to 150mb or so of memory and then when I
call GC.start it goes back down to around 8mb. But after a few cycles
memory stops being reclaimed.

Chris

Print out all objects to a file before the leak and after the leak,
sorted by class and then by object_id, and diff the files.
If you suspect particular objects, add a finalizer using
ObjectSpace.define_finalizer (add_finalizer is the obsolete name) and
put some trace in it.
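
A rough sketch of the before/after comparison, done in memory with per-class counts instead of files. The helper names snapshot_counts and growth, the LeakyRecord class, and the $retained global are all made up for the example; the global stands in for whatever is holding on to the leaked objects.

```ruby
# Count live objects per class after forcing a collection.
def snapshot_counts
  GC.start
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Classes whose instance count grew between two snapshots.
def growth(before, after)
  after.each_with_object({}) do |(klass, n), diff|
    delta = n - before.fetch(klass, 0)
    diff[klass] = delta if delta > 0
  end
end

class LeakyRecord; end
$retained = []                      # simulated leak: a global that keeps growing

before = snapshot_counts
100.times { $retained << LeakyRecord.new }
after  = snapshot_counts

report = growth(before, after)
report.sort_by { |_, delta| -delta }.first(5).each do |klass, delta|
  puts "#{klass}: +#{delta}"
end
```

Classes that keep growing across several cycles are the ones worth tracing further.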


Darshan P.

“The trouble with work is that it interferes with living.” - Peter
Mckill 1968

http://scattrbrain.com

On 10/26/06, [email protected] wrote:

it would be expensive, but i wonder if dumping the objects in objectspace
might be useful - since Marshal.dump already follows all references it seems
like a custom _dump method on object, through which objects could add
themselves to a tree, might do the trick. in other words, if you dumped an
object with a global tree in context then all objects being dumped as a
result would add themselves to this tree. after the dump, you simply keep a
copy of the tree…

Not sure. What I was suggesting was that the real goal is to somehow
root out the reference path or paths keeping an object from
being reclaimed, without making additional references.

Another issue is that ObjectSpace.each_object can give you objects
which aren’t really alive:

rick@frodo:/public/rubyscripts$ cat gctest.rb
class Foo
  def initialize
    @iv = "bar"
  end
end

def make_foo
  p Foo.new
end

GC.enable

make_foo

ObjectSpace.each_object {|f| p f if Foo === f }
ObjectSpace.garbage_collect
puts "after gc"
ObjectSpace.each_object {|f| p f if Foo === f }
puts "done"
rick@frodo:/public/rubyscripts$ ruby gctest.rb
#<Foo:0xb7dc1804 @iv="bar">
#<Foo:0xb7dc1804 @iv="bar">
after gc
#<Foo:0xb7dc1804 @iv="bar">
done

I’ve played around with various versions of this, such as
each_object(Foo), and that instance of Foo with no apparent references
to it seems to stick around for some reason.

I instantiated Foo in the make_foo method to make sure that it wasn’t
still in the current stack frame.

This really goes to show that the GC’s guarantee is that it never
frees live objects, not that it frees dead ones as soon as possible.

It also shows why you shouldn’t rely on finalization as part of
application/system logic, since you never know when, or even if it
will be called.


Rick DeNatale


Rick DeNatale wrote:

This can be a difficult problem. You really want to find the
reference paths from root objects.

Some GC languages like Smalltalk have methods in object to get a list
of everything which references it. I haven’t seen such a facility in
Ruby. Then the problem with this is that calling this method
generates additional references to the objects referencing the object
etc. This kind of heisenberg effect makes building a tool to find
reference paths difficult.

I wrote a patch for ruby 1.6/1.7 that would search for all ways of
reaching an object from root objects in objectspace:

http://redshift.sourceforge.net/debugging-GC/

Usage info at:

http://redshift.sourceforge.net/debugging-GC/gc-patch.txt

Nobu updated it for CVS as of 12 Aug 2005:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/151854

I’ve found this to be useful only once or twice, but in those rare cases
it can be very helpful…

On Fri, 27 Oct 2006, Rick DeNatale wrote:

Not sure, what I was suggesting was that the real goal is to somehow root
out the reference path or path which is keeping an object from being
reclaimed without making additional references.

right. i was thinking of something like this:

 harp:~ > cat a.rb
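 def really_stupid_reference_finder obj
   begin
     # any attempt to marshal obj now throws instead of dumping
     class << obj
       def _dump(*_) throw :referer, true end
     end
   rescue TypeError
     nil   # some objects cannot take singleton methods
   end
   ObjectSpace.each_object do |candidate|
     next if candidate.equal? obj   # compare identity, not ==
     referer = catch :referer do    # symbol tag also works on later rubies
       begin
         Marshal.dump candidate
       rescue TypeError
         false   # candidate is not dumpable
       end
       false
     end
     return candidate if referer
   end
   return nil
 ensure
   GC.start
 end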



 a = [b = '42']

 referer = really_stupid_reference_finder b
 p referer
 p referer == a

 referer = really_stupid_reference_finder [ 'new_array' ]
 p referer
 p referer == a



 harp:~ > ruby a.rb
 ["42"]
 true
 nil
 false

-a

[email protected] wrote:

def really_stupid_reference_finder obj

That’s a nice idea!

One problem is that a referrer can be something other than an object: a
ruby global var, a C global var, a local var. Or it can be an object
that is not dumpable, such as a proc binding.

But the throw/dump combo is a great trick to remember…

What I’d really like to see is a general object graph traversal
mechanism that can be used to help implement marshal and other dumpers,
gc tools, etc. Several (3 or 4) years ago, matz said he was moving in
this direction…[1]


[1] See:

http://www.ruby-lang.org/cgi-bin/cvsweb.cgi/ruby/marshal.c?cvsroot=src

and grep for vjoel:

  • marshal.c (w_object): T_DATA process patch from Joel VanderWerf
    [email protected]. This is temporary hack; it remains
    undocumented, and it will be removed when marshaling is
    re-designed.

The hack is still there (as of 1.8.5, anyway), still undocumented, and
still useful.

The original discussion about why it is useful starts at:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/34037

Joel VanderWerf wrote:

What I’d really like to see is a general object graph traversal
mechanism that can be used to help implement marshal and other dumpers,
gc tools, etc. Several (3 or 4) years ago, matz said he was moving in
this direction…[1]

This is the thread where matz said he was looking at a more general
traversal mechanism to support marshal and other purposes:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/34335

Maybe it is still “vapor”…
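
As an illustration of what such a traversal could look like, here is a sketch that walks the object graph breadth-first using ObjectSpace.reachable_objects_from from the objspace extension found in later Ruby releases. It is not the mechanism discussed in that thread. The helper name reference_path and the max_visits cutoff are assumptions made for the example, and internal VM objects are simply skipped.

```ruby
require 'objspace'

# Hypothetical helper: breadth-first search from start to target.
# Returns the chain of objects linking them, or nil if none was found.
def reference_path(start, target, max_visits = 100_000)
  came_from = {}.compare_by_identity   # object => object it was reached from
  came_from[start] = nil
  queue = [start]

  until queue.empty?
    current = queue.shift
    if current.equal?(target)
      path = [current]
      path.unshift(came_from[path.first]) until path.first.equal?(start)
      return path
    end
    break if came_from.size > max_visits   # give up on very large graphs

    (ObjectSpace.reachable_objects_from(current) || []).each do |ref|
      # Internal VM objects come back wrapped; this sketch ignores them.
      next if ObjectSpace::InternalObjectWrapper === ref
      next if came_from.key?(ref)
      came_from[ref] = current
      queue << ref
    end
  end
  nil
end

Holder  = Struct.new(:items)
payload = "large buffer"
root    = Holder.new([{ data: payload }])

path = reference_path(root, payload)
puts path.map { |obj| obj.class }.inspect
```

The search only follows references held by objects, so it still cannot explain an object kept alive solely by a local variable or a C global.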

On Fri, 27 Oct 2006, Joel VanderWerf wrote:

undocumented, and it will be removed when marshaling is
re-designed.

The hack is still there (as of 1.8.5, anyway), still undocumented, and still
useful.

The original discussion about why it is useful starts at:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/34037

now that is good to know!

cheers.

-a