GC doesn't collect?

Any ideas why:

1.times { a = 'a'*1000};
30.times { GC.start };
print ObjectSpace.each_object{|o| print o}

prints out ‘aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa…’ despite the fact that
the a’s should have been collected?
Thanks! More questions to come, most likely.
-R

On Jul 30, 2008, at 17:13 PM, Roger P. wrote:

Any ideas why:

1.times { a = 'a'*1000};
30.times { GC.start };
print ObjectSpace.each_object{|o| print o}

prints out ‘aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa…’ despite the fact that
the a’s should have been collected?

Ruby’s garbage collector walks the C stack looking for values that
appear to point to the ruby heap. Ruby thinks you still have a
reference to your string because of this.

On Jul 30, 2008, at 17:40 , Eric H. wrote:

Ruby’s garbage collector walks the C stack looking for values that
appear to point to the ruby heap. Ruby thinks you still have a
reference to your string because of this.

well… I think in this case it is because he never dereferenced a, so
it is still a valid live object. {} doesn’t scope variables the same
way as in, say, C.

On 31.07.2008, at 02:13, Roger P. wrote:

Any ideas why:

1.times { a = 'a'*1000};
30.times { GC.start };
print ObjectSpace.each_object{|o| print o}

prints out ‘aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa…’ despite the fact that
the a’s should have been collected?

In my opinion GC has not collected ‘a’ at the time you run through your
ObjectSpace.
If you wait longer then ‘a’ will be collected.

this one collects ‘a’ on my machine:

$ ruby -e "1.times { a = 'a'*1000};
32.times { GC.start; sleep(1) };
print ObjectSpace.each_object{|o| print o}
"

regards, Sandor Szücs

The thing with garbage collection (in most languages, I don’t know about
Ruby specifically) is that it happens when the interpreter/compiler feels
like doing it, not when you tell it to do it. When it “feels like doing
it” could depend on a lot of factors. (I imagine it as being a low
priority child process.)

As far as I know there’s no way to force garbage collection to happen,
although on the face of it this would seem to be a useful facility.

Just to clear up confusion:
I believe that

GC.start ‘forces’ a garbage collection, and that
do…end and
{…} scopes do indeed have their own scope and local variables, as
methods do.

interestingly,

def go
  1.times { a = 'a'*1000};
end
go
30.times { GC.start };
print ObjectSpace.each_object{|o| print o}

yields the same errant results. I might look into it sometime. Very
weird.

Now for some questions:

currently the GC marks live objects then sweeps to find any free
objects–except it doesn’t actually free any objects that are free but
need finalization. It seems to only do finalizations when a user
explicitly calls GC.start, or when the program terminates. Is there a
reason for this ‘deferred_final_list’ activity?

Also is it true that objects marked FL_SINGLETON should never be freed,
even if they are no longer referenced by any live code? Or is
FL_SINGLETON just used as an internal GC marker to mean ‘the heap this
object comes from is entirely free–don’t bother adding it to the
freelist since it is on the chopping block to be free’ed’ and nothing
else?

Thanks!
-R

On Fri, 2008-08-01 at 01:24 +0900, Dave B. wrote:

The thing with garbage collection (in most languages, I don’t know about
Ruby specifically) is that it happens when the interpreter/compiler feels
like doing it, not when you tell it to do it. When it “feels like doing
it” could depend on a lot of factors. (I imagine it as being a low
priority child process.)

As far as I know there’s no way to force garbage collection to happen,
although on the face of it this would seem to be a useful facility.

GC.start … right?

M. Edward (Ed) Borasky
ruby-perspectives.blogspot.com

“A mathematician is a machine for turning coffee into theorems.” –
Alfréd Rényi via Paul Erdős

Roger P. wrote:

Any ideas why:

1.times { a = 'a'*1000};
30.times { GC.start };
print ObjectSpace.each_object{|o| print o}

I presume you don’t really want both those invocations of “print”?

prints out ‘aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa…’ despite the fact that
the a’s should have been collected?

For the record, I’ve just tested this with Ruby 1.8.7 running on Debian
Lenny (x86-64) and it does not print ‘aaaaaaa…’

I also tried this code:

1.times { a = 'a'*1000};
puts a

This tests the earlier assertion that “a” has not gone out of scope. It
produces a run-time error because “a” has in fact gone out of scope, so
Roger seems right in expecting it to have been garbage collected.
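
A minimal sketch of that scope check, written so it reports the result instead of aborting. The variable name outcome is only for illustration, and nothing here says anything about whether the string has been collected; it only shows that the name a is not visible outside the block.

```ruby
# The block-local variable a exists only inside the braces.
1.times { a = 'a' * 1000 }

# Outside the block, a bare `a` is treated as a method call and
# raises NameError because no such local variable or method exists.
outcome =
  begin
    a
    "visible"
  rescue NameError
    "not visible (NameError)"
  end

puts "a outside the block: #{outcome}"
# prints: a outside the block: not visible (NameError)
```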

I’ve also tested it with and without the explicit call on GC.start.
Without the call the object is still there. With just a single call (30
calls not needed) the object is gone.

John

On Jul 31, 2008, at 22:18 PM, Roger P. wrote:

Just to clear up confusion:
I believe that

GC.start ‘forces’ a garbage collection

Well, if there’s no garbage then there’s no collection.

, and that
do…end and
{…} scopes do indeed have their own scope and local variables, as
methods do.

Ruby scope, yes, but that doesn’t mean the C stack has no pointers to
your object. There’s no guarantee that all references to your object
have been clobbered by subsequent calls. (Or that there are no values on
the C stack that merely look like pointers to your objects.)

weird.

This is simply how ruby’s conservative collector works.
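
A rough way to probe this without printing every object in the heap is to count the strings that look like the test string. This is only a sketch: it assumes Ruby 1.9 or later (where each_object without a block returns an enumerator), the helper name count_test_strings is made up for illustration, and the number it reports can legitimately differ between platforms and runs, which is the whole point of a conservative collector.

```ruby
# Count live strings that are exactly 1000 bytes of the letter "a".
# Byte-level checks are used so odd encodings cannot raise errors.
def count_test_strings
  ObjectSpace.each_object(String).count do |s|
    s.bytesize == 1000 && s.each_byte.all? { |b| b == 97 } # 97 is "a"
  end
end

1.times { a = 'a' * 1000 }
GC.start

# 0 means the string was collected; 1 (or more) means something,
# possibly a stale value on the C stack, still appears to reference it.
puts "matching strings still alive: #{count_test_strings}"
```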

Now for some questions:

currently the GC marks live objects then sweeps to find any free
objects–except it doesn’t actually free any objects that are free but
need finalization.

I think you found a bug. Ruby 1.6 called finalizers after sweep, but
1.8.6 doesn’t.

$ cat final.rb
$finalizer_proc = proc do |obj_id| puts "#{obj_id} finalized" end

def a() b end
def b() c end
def c() d end
def d() e end
def e() f end
def f() g end
def g() h end
def h() i end
def i() j end
def j() k end
def k() make_obj end

def make_obj
  o = Object.new
  ObjectSpace.define_finalizer o, $finalizer_proc
  o.id
end

obj_id = a

puts "#{obj_id} created"

a = []
s = 'a'

begin
  loop do
    ObjectSpace._id2ref obj_id
    a << s.succ!
    print "#{s}\r"
  end
rescue RangeError
  puts
  puts "#{obj_id} collected"
end

$ ruby -v final.rb
ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]
81660 created
avr
81660 collected
81660 finalized
$ ruby16 -v final.rb
ruby 1.6.8 (2005-09-21) [i386-darwin9.4.0]
1117686 created
1117686 finalized
omu
1117686 collected
$

It seems to only do finalizations when a user
explicitly calls GC.start, or when the program terminates. Is there a
reason for this ‘deferred_final_list’ activity?

This patch seems to restore 1.6 behavior:

$ svn diff gc.c
Index: gc.c
===================================================================
--- gc.c (revision 18230)
+++ gc.c (working copy)
@@ -1196,7 +1196,7 @@ gc_sweep()

     /* clear finalization list */
     if (final_list) {
-        deferred_final_list = final_list;
+        finalize_list(final_list);
         return;
     }
     free_unused_heaps();
$ ./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb ./runruby.rb --extout=.ext -- ~/final.rb
605300 created
605300 finalized
rek
605300 collected

I’m not sure if it was accidentally removed or not; the log for r7090
points to several segmentation faults due to evil things done with
finalizers and threads:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-dev/24536

Also is it true that objects marked FL_SINGLETON should never be
freed,
even if they are no longer referenced by any live code? Or is
FL_SINGLETON just used as an internal GC marker to mean ‘the heap this
object comes from is entirely free–don’t bother adding it to the
freelist since it is on the chopping block to be free’ed’ and nothing
else?

I’m not sure about this. It was added in the same changeset as above.

I’ll make a pointer to this thread on ruby-core.

On 1 Aug., 09:08, Eric H. wrote:

omu
1117686 collected
$

It seems to only do finalizations when a user
explicitly calls GC.start, or when the program terminates. Is there a
reason for this ‘deferred_final_list’ activity?

This patch seems to restore 1.6 behavior:

But does it also make finalization happen before program exit? (This
was Roger’s main point IIRC.) There was probably a good reason why
the order was reversed, namely to make sure that objects were gone
before invoking their finalizers. Actually this is how it is defined,
i.e. the finalizer is called after the object has vanished (see
Pickaxe for example).

+        finalize_list(final_list);
         return;
     }
     free_unused_heaps();

Cheers

robert

On Aug 1, 2008, at 01:34 AM, Robert K. wrote:

1117686 finalized

But does it also make finalization happen before program exit? (This
was Roger’s main point IIRC.)

Yes, but the suggested patch was not correct. See [ruby-core:18050]
for the proper patch. The problem was that versions of 1.8 never ran
finalizers unless you called GC.start or were exiting.

There was probably a good reason why
the order was reversed namely to make sure that objects were gone
before invoking their finalizers.

No, finalizers were never called before collection. It looks like it
was a simple oversight while fixing various SEGV bugs when doing evil
things to ruby.

Actually this is how it is defined,
i.e. the finalizer is called after the object has vanished (see
Pickaxe for example).

There was no code for running finalizers, except at exit or when
calling GC.start.

John W. wrote:

For the record, I’ve just tested this with Ruby 1.8.7 running on Debian
Lenny (x86-64) and it does not print ‘aaaaaaa…’

Interesting. Maybe there’s a difference among versions. For me it has
the resultant ‘odd’ behavior in Ubuntu with Ruby 1.8.6 patchlevel 111,
os x, and windows mingw (all 32-bit) and ruby 1.8.5 [x86_64-linux].
Perhaps the 64-bit aspect is clearing the false positives.

I also tested the following code on those same platforms, all running
1.8.6:

def go
  1.times { a = 'a'*1000};
end

go
def recurse b
recurse b
end

begin
recurse 33 # an attempt to “clear the stack”
rescue => e
  print "rescued #{e}"
end

30.times { GC.start };
ObjectSpace.each_object{|o| print o}

This still showed the miscreant a’s in ubuntu 32-bit, x86_64 Linux, and
windows but not in OS X [it actually worked there]. FWIW.
I guess this corroborates the theory that it’s a false positive but I’m
still uneasy about it. I guess it’s not a huge problem, but still
somewhat disconcerting.

Some questions:

Currently it appears that if there is a freeable object that wants
finalization within a page that can be freed, it doesn’t add that
page to the freelist… I can’t tell from the code, however, whether that
page is basically ‘pinned’ “forever” or not when that happens. And
with ruby’s current GC, it seems almost impossible to reclaim memory, so
I don’t even know how to test this [the test case being that there are
finalizable objects within a page that wants to be freed–does that page
ever get freed eventually?]. FL_SINGLETON seems to play some role, but I’m
not sure what.

In other news, please accept my apologies–it appears that
rb_gc_finalize_deferred IS called by eval0 “every 256 eval0’s”. I’m not
sure if that is optimal or not, or even a good idea, but at least it
gets called.

-R

On 1 Aug., 07:18, Roger P. wrote:

Just to clear up confusion:
I believe that

GC.start ‘forces’ a garbage collection,

Yes, it forces a GC run. But I would not be so sure about whether it
forces actual collection of all collectible instances. In other
words: the GC is run, but if it decides that there’s nothing to collect
yet, it won’t collect anything even if there were objects that could be
freed.
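
The first half of that claim (a run is triggered) can be checked on newer interpreters by looking at the run counter. This is a sketch that assumes Ruby 1.9 or later, since GC.count is not available in the 1.8 series discussed in this thread. It says nothing about which objects were actually reclaimed by the run.

```ruby
# Assumes Ruby 1.9 or later, where GC.count is available.
runs_before = GC.count
GC.start
runs_after = GC.count

puts "GC runs before: #{runs_before}, after: #{runs_after}"
puts "GC.start triggered a run: #{runs_after > runs_before}"
```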

and that
do…end and
{…} scopes do indeed have their own scope and local variables, as
methods do.

Yes.

interestingly,

def go
  1.times { a = 'a'*1000};

You do not need the block here as your method already provides its own scope.

currently the GC marks live objects then sweeps to find any free
objects–except it doesn’t actually free any objects that are free but
need finalization. It seems to only do finalizations when a user
explicitly calls GC.start, or when the program terminates. Is there a
reason for this ‘deferred_final_list’ activity?

Which code led you to this conclusion? I am asking because
finalizers are sometimes hard to get right. For example, you cannot
define a finalizer with a block inside an instance method because the
block will hold on to self and thus prevent collection. In that case
you will see finalization only happen at program exit.
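
A common way around that pitfall is to build the finalizer proc somewhere that self is not the instance, for example in a class method. The sketch below is only an illustration: the class name TempResource, the method finalizer_for, and the log array are all made up, and whether the finalizer has already run right after GC.start depends on the collector.

```ruby
class TempResource
  # Built in a class method: inside it, self is the class, so the proc
  # does not capture the instance that is about to be finalized.
  def self.finalizer_for(name, log)
    proc { |object_id| log << "#{name} (id #{object_id}) finalized" }
  end

  def initialize(name, log)
    @name = name
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(name, log))
  end
end

log = []
TempResource.new("scratch", log)
GC.start
# Whether the entry is already present depends on the collector; the
# finalizer is only guaranteed a chance to run by program exit.
puts "finalized so far: #{log.size}"
```

Writing the proc directly inside initialize would capture self through the proc's binding, which is the situation described above.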

Kind regards

robert