Forum: Ruby
Topic: Ruby GC question (MRI, JRuby, etc)

Posted by Chuck Remes (cremes)
on 2010-08-17 19:26
(Received via mailing list)
My basic understanding of the garbage collectors in use by the various 
Ruby runtimes is that they all search for objects from a "root" memory 
object. If an object cannot be reached from this root, then it is 
collected.

Here's a snippet of ruby code. I'm not sure how the GC will treat it.

class Foo
  def initialize
    @baz = Baz.new
    @quxxo = Quxxo.new
  end
end

class Bar
  def run
    Foo.new
    nil
  end
end

bar = Bar.new
bar.run
bar.run
bar.run


What happens to the instances of Foo created in the call to #run? Since 
I am not saving them somewhere (e.g. to an array), do they get 
collected right away?

If the Foo instances get collected, is it safe to assume the Baz and 
Quxxo instances are being collected at the same time? Does their 
existence prevent the Foo instance from being collected?

cr
Posted by Kirk Haines (Guest)
on 2010-08-17 19:42
(Received via mailing list)
On Tue, Aug 17, 2010 at 11:19 AM, Chuck Remes <cremes.devlist@mac.com> 
wrote:
> My basic understanding of the garbage collectors in use by the various Ruby runtimes is that they all search for objects from a "root" memory object. If an object cannot be reached from this root, then it is collected.

It depends on the Ruby. JRuby and Rubinius have different garbage
collectors than MRI Ruby.

> What happens to the instances of Foo created in the call to #run? Since I am not saving them  somewhere (e.g. to an array), do they get collected right away?

Nothing is ever collected right away in the MRI rubies currently.  The
object will exist in memory until a GC cycle runs. Unless a GC cycle
is started manually (GC.start), GC cycles only run when Ruby runs
short on preallocated memory.  Take a look at this
http://www.engineyard.com/blog/2010/mri-memory-allocation-a-primer-for-developers/
or google on the subject and you'll find a number of articles that
will explain how it works in more detail than you will get in an
email.
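
A minimal sketch of the point about manual collection, assuming MRI
(CRuby). GC.count and GC.start are existing MRI calls; the variable names
are only illustrative, and other implementations may report different
values.

```ruby
# Assumes MRI (CRuby). GC.count reports how many GC cycles have run
# since the process started.
runs_before = GC.count
GC.start                 # explicitly request a collection
runs_after = GC.count

puts "GC cycles before GC.start: #{runs_before}"
puts "GC cycles after GC.start:  #{runs_after}"
```

Without the GC.start call, the counter only moves when allocation
pressure makes the runtime decide a collection is needed.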

> If the Foo instances get collected, is it safe to assume the Baz and Quxxo instances are being collected at the same time? Does their existence prevent the Foo instance from being collected?

It depends on why you are assuming it. If you have an implementation
that depends on specific garbage collection behaviors or collection in
specific chronologies in order to work right, it is probably not safe
to assume anything. If you are just trying to understand the memory
behavior of your code, and make sure you aren't doing dumb things that
can lead to a memory leak, then yes, it is safe to assume that the Baz
and Quxxo instances will be collected along with the Foo.
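
To watch this happen, here is a rough sketch assuming MRI. Baz and Quxxo
are assumed to be plain, empty classes, and `live` is a hypothetical
helper built on ObjectSpace.each_object. The exact numbers are not
guaranteed, since MRI scans the stack conservatively and a few instances
may survive a cycle.

```ruby
# Stand-ins for the classes from the original post; Baz and Quxxo are
# assumed to be plain, empty classes.
class Baz; end
class Quxxo; end

class Foo
  def initialize
    @baz = Baz.new
    @quxxo = Quxxo.new
  end
end

# Hypothetical helper: how many instances of klass the heap still holds.
# Assumes MRI; JRuby disables each_object for ordinary classes by default.
def live(klass)
  ObjectSpace.each_object(klass).count
end

def snapshot
  [Foo, Baz, Quxxo].map { |k| [k.name, live(k)] }.to_h
end

CREATED = 10_000
CREATED.times { Foo.new }   # every Foo is abandoned immediately

before = snapshot
GC.start
after = snapshot

puts "created:         #{CREATED} of each class"
puts "before GC.start: #{before}"
puts "after GC.start:  #{after}"
```

The Baz and Quxxo counts should drop together with the Foo count,
because the only references to them were held by the abandoned Foo
instances.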


Kirk Haines
Posted by Chuck Remes (cremes)
on 2010-08-17 19:58
(Received via mailing list)
On Aug 17, 2010, at 12:42 PM, Kirk Haines wrote:

> [...] Unless a GC cycle
> is started manually (GC.start), GC cycles only run when Ruby runs
> short on preallocated memory.
> [...] If you are just trying to understand the memory
> behavior of your code, and make sure you aren't doing dumb things that
> can lead to a memory leak, then yes, it is safe to assume that the Baz
> and Quxxo instances will be collected along with the Foo.

Kirk,

thanks for the pointer to your write-up at engineyard. I'll be sure to 
read through it.

In the meantime, it looks like I need to save my Foo instances to an 
array or something similar if I want to make sure that they do NOT get 
collected until I'm ready.
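
A small sketch of that idea, using WeakRef from the standard library to
observe the instances without keeping them alive. The Foo class here is
an empty stand-in, and `held` and `probes` are illustrative names only.

```ruby
require 'weakref'

class Foo; end   # empty stand-in for the Foo class from the original post

held = []        # strong references live here
probes = Array.new(3) do
  foo = Foo.new
  held << foo        # the array keeps the instance reachable
  WeakRef.new(foo)   # a weak reference does not keep it alive
end

GC.start
alive_while_held = probes.all?(&:weakref_alive?)
puts "all alive while held in the array: #{alive_while_held}"

held.clear       # drop the strong references
GC.start
# No particular result is guaranteed here: the instances are now only
# eligible for collection.
puts "still alive after clearing: #{probes.count(&:weakref_alive?)} of #{probes.size}"
```

As long as the array holds the instances they stay reachable, so a
collection cannot reclaim them. Once the array is cleared they merely
become eligible.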

cr
Posted by Brian Candler (candlerb)
on 2010-08-18 11:44
Chuck Remes wrote:
> In the meantime, it looks like I need to save my Foo instances to an 
> array or something similar if I want to make sure that they do NOT get 
> collected until I'm ready.

In any case, if you don't keep a reference to them somewhere, then you 
can never call any method on them, so the objects are obviously useless 
(which is why they are garbage-collected in the first place).
Posted by Chuck Remes (cremes)
on 2010-08-18 14:38
(Received via mailing list)
On Aug 18, 2010, at 4:44 AM, Brian Candler wrote:

> Chuck Remes wrote:
>> In the meantime, it looks like I need to save my Foo instances to an 
>> array or something similar if I want to make sure that they do NOT get 
>> collected until I'm ready.
> 
> In any case, if you don't keep a reference to them somewhere, then you 
> can never call any method on them, so the objects are obviously useless 
> (which is why they are garbage-collected in the first place)

Not necessarily true. The Bar class in my example could have its own 
internal lifecycle where it is generating events for Baz and Quxxo which 
in turn are reacting to or generating events for Bar. Plus, they all may 
be interacting with yet more objects on a local or remote system. 
Retaining a reference to the Bar instance from Foo does not preclude 
them from doing useful work.

cr
Posted by Brian Candler (candlerb)
on 2010-08-18 15:27
Chuck Remes wrote:
>> In any case, if you don't keep a reference to them somewhere, then you 
>> can never call any method on them, so the objects are obviously useless 
>> (which is why they are garbage-collected in the first place)
> 
> Not necessarily true. The Bar class in my example could have its own 
> internal lifecycle where it is generating events for Baz and Quxxo

Not unless it is running in its own thread. In that case, there will be 
a reference to the object held within the thread - for example in a 
local variable.

However, if the object exists solely to be shared by DRb, then yes you 
will need to keep a handle to it to stop it being garbage-collected. 
That's because DRb uses ObjectSpace._id2ref to locate objects via just their id.
Posted by Robert Klemme (Guest)
on 2010-08-18 15:59
(Received via mailing list)
2010/8/17 Chuck Remes <cremes.devlist@mac.com>:
> My basic understanding of the garbage collectors in use by the various Ruby runtimes is
> that they all search for objects from a "root" memory object. If an object cannot be reached
> from this root, then it is collected.

There is a small error in the wording above.  While the issue has been
explained already I want to stress this point because this is a
mistake many new to GC make and it explains some weird effects that
special tests show.  It should have read

If an object cannot be reached from this root, then it _can be_ 
collected.

Small change, big difference. :-)

Kind regards

robert
Posted by Chuck Remes (cremes)
on 2010-08-18 16:23
(Received via mailing list)
On Aug 18, 2010, at 8:58 AM, Robert Klemme wrote:

> If an object cannot be reached from this root, then it _can be_ collected.
> 
> Small change, big difference. :-)

Ha! Yes, quite true. Depending upon the GC algo in use, some objects may 
never be collected even though they are eligible for collection.

cr
Posted by Jörg W Mittag (Guest)
on 2010-08-21 01:23
(Received via mailing list)
Chuck Remes wrote:
>> If an object cannot be reached from this root, then it _can be_ collected.
>> 
>> Small change, big difference. :-)
> Ha! Yes, quite true. Depending upon the GC algo in use, some objects
> may never be collected even though they are eligible for
> collection.

In particular, the GC algorithms in MRI and YARV are specifically
designed with the assumption that they will never actually run in
99.999% of all cases. They are designed for scripting, where a script
doesn't even allocate enough memory to trigger a collection, runs for
a couple of seconds and then exits, after which the OS simply reclaims
the memory: no GC needed.
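
One way to check this for a given workload is to count GC cycles around
it. This is a sketch assuming MRI; `gc_runs_during` is a hypothetical
helper, not a built-in, and the workload sizes are arbitrary.

```ruby
# Hypothetical helper: number of GC cycles that ran while the block executed.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

# A tiny, script-sized workload versus a much heavier one.
small = gc_runs_during { 100.times { Object.new } }
large = gc_runs_during { 1_000_000.times { Object.new } }

puts "GC cycles during small workload: #{small}"
puts "GC cycles during large workload: #{large}"
```

A short script that allocates little will often report zero cycles for
the small workload, while the large one forces the collector to run.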

That's why YARV and especially MRI are so exceptionally bad for server
loads. It's also why REE can never be merged into mainline.

Unless *specifically* guaranteed by the language specification, you
simply cannot make any assumptions about when or even if objects get
collected. Not even Python makes such guarantees, popular myths
notwithstanding.

jwm
Posted by Yukihiro Matsumoto (Guest)
on 2010-08-21 02:15
(Received via mailing list)
Hi,

In message "Re: Ruby GC question (MRI, JRuby, etc)"
    on Sat, 21 Aug 2010 08:15:15 +0900, Jörg W Mittag 
<JoergWMittag+Ruby@GoogleMail.Com> writes:

|In particular, the GC algorithms in MRI and YARV are specifically
|designed with the assumption that they will never actually run in
|99.999% of all cases. They are designed for scripting, where a script
|doesn't even allocate enough memory to trigger a collection, runs for
|a couple of seconds and then exits, after which the OS simply reclaims
|the memory: no GC needed.

99.999% is a bit exaggerated, but it is true that the garbage
collection algorithms of YARV and MRI focus on throughput for
short-running programs that are not memory-intensive, and the GC of
REE is not suitable for those programs.

              matz.
Posted by Charles Nutter (headius)
on 2010-08-21 02:17
(Received via mailing list)
On Tue, Aug 17, 2010 at 12:19 PM, Chuck Remes <cremes.devlist@mac.com> 
wrote:
> My basic understanding of the garbage collectors in use by the various Ruby runtimes is that they all search for objects from a "root" memory object. If an object cannot be reached from this root, then it is collected.
>
> Here's a snippet of ruby code. I'm not sure how the GC will treat it.

...

> What happens to the instances of Foo created in the call to #run? Since I am not saving them  somewhere (e.g. to an array), do they get collected right away?

Not right away on any impl; they're allocated on the heap, so even
though they're immediately abandoned they still require GC to run.

> If the Foo instances get collected, is it safe to assume the Baz and Quxxo instances are being collected at the same time? Does their existence prevent the Foo instance from being collected?

They would not; no external references to the Foo instance ever exist
on the heap or on the stack.

As far as how this behaves in JRuby: since the object is short-lived,
it would never make it out of the "eden" space on the heap, and with
the JVM's GC that means it would basically have no GC cost (young
objects that don't survive even a single GC cycle are practically
free). The only cost you'd be paying would be the allocation and .new
costs.

- Charlie