Threads preventing garbage collection?

I wondered if someone can explain the following behaviour to me.

For some reason, in the following code the instances of object Foo are
prevented from being garbage collected.

def make_thread
  Thread.new { sleep 2 }
end

class Foo
  attr_accessor :bar
end

10.times { |i|
  f = Foo.new
  f.bar = make_thread
}

GC.start
sleep 1
GC.start

ObjectSpace.each_object(Foo) { |o| p o }

But if I modify the code as follows, then the Foo objects are
garbage-collected just fine:

--- tst.rb      2008-09-17 10:00:27.000000000 +0100
+++ tst2.rb     2008-09-17 10:03:04.000000000 +0100
@@ -6,9 +6,10 @@
   attr_accessor :bar
 end

+threads = (0...9).collect { make_thread }
 10.times { |i|
   f = Foo.new
-  f.bar = make_thread
+  f.bar = threads[i]
 }

 GC.start

It’s as if the thread created by Thread.new keeps a reference to the Foo
object which existed at the time, even though inside make_thread it is
out of scope, so it shouldn’t even know about it.

By tweaking the ‘sleep’ values, it also seems that when the thread
terminates it then does permit the Foo instance to be garbage-collected.

Any ideas as to what’s going on? I get the same results on these two
versions of Ruby:

ruby 1.8.4 (2005-12-24) [i486-linux]
ruby 1.8.6 (2008-03-03 patchlevel 114) [i686-linux]

Thanks,

Brian.

2008/9/17 Brian C.:

I wondered if someone can explain the following behaviour to me.

There is no guarantee that all collectible objects are indeed
collected during a GC run. GC is not really deterministic - at least
from the user’s point of view.

For some reason, in the following code the instances of object Foo are
prevented from being garbage collected.

No, they are not collected. This is something different!

Kind regards

robert

Robert K. wrote:

For some reason, in the following code the instances of object Foo are
prevented from being garbage collected.

No, they are not collected. This is something different!

OK, please consider my question rephrased as “why are these objects not
collected?”

Let me increase the number of objects:

def make_thread
  Thread.new { sleep 10000 }
end

class Foo
  attr_accessor :bar
end

10000.times { |i|
  f = Foo.new
  f.bar = make_thread
}

GC.start
sleep 1
GC.start

count = 0
ObjectSpace.each_object(Foo) { |o| count += 1 }
puts "#{count} objects"

On my machine, this program uses about 150MB of RAM, and all 10,000
objects are remaining at the end of the run.

Have I done something here which is “wrong”? Are the known pitfalls of
the garbage collector documented anywhere, or ways to write programs so
as to avoid them? Or can it simply not be relied upon, ever?

Regards,

Brian.

Your version only works because the threads are dying after 2 seconds.
Change

def make_thread
  Thread.new { sleep 2 }
end

to

def make_thread
  Thread.new { sleep 10000 }
end

and run it again.

The big arrays are happily garbage-collected; the Foos are not.

2008/9/17 Brian C.:

f.bar = make_thread
On my machine, this program uses about 150MB of RAM, and all 10,000
objects are remaining at the end of the run.

Have I done something here which is “wrong”? Are the known pitfalls of
the garbage collector documented anywhere, or ways to write programs so
as to avoid them?

You probably just did not allocate enough new objects to make the GC
collect stuff. AFAIK there is also a minimal process size under which
no GC occurs.
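Whether a collection has run at all can be checked rather than guessed. The sketch below assumes Ruby 1.9 or later, where GC.count returns the number of GC runs so far in the process; it is not available on the 1.8 builds discussed in this thread. The variable names are illustrative only.

```ruby
# GC.count is the number of garbage collections run so far in this process.
runs_at_start = GC.count

# Allocate some short-lived arrays. Whether this alone triggers a GC run
# depends on the interpreter's heap state, so no particular result is assumed.
10.times { Array.new(100_000) }
runs_after_alloc = GC.count

# An explicit GC.start always performs a collection.
GC.start
runs_after_start = GC.count

puts "GC runs: at start=#{runs_at_start}, " \
     "after allocation=#{runs_after_alloc}, " \
     "after GC.start=#{runs_after_start}"
```

If the count does not move during the allocation phase, no collection has happened yet, and objects that are already unreachable will still show up in ObjectSpace.each_object.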

Or can it simply not be relied upon, ever?

Yes, it can. Try this

def make_thread
  Thread.new { sleep 2 }
end

class Module
  def count
    c = 0
    ObjectSpace.each_object(self) { c += 1 }
    c
  end
end

Foo = Struct.new :bar

1000.times {
  Foo.new.bar = make_thread
}

# use memory

Thread.new { loop { Array.new(1_000_000); sleep 0.1 } }

loop do
  puts Foo.count
  GC.start
  sleep 1
end

Cheers

robert

On Wed, Sep 17, 2008 at 10:07 AM, Brian C. wrote:

attr_accessor :bar

object which existed at the time, even though inside make_thread it is

Thanks,

Brian.

The threads are not garbage collected until they terminate and so the
Foo instances are not being GC’d. You’re not sleeping long enough at
the end of your script. Try using something like:

GC.start
sleep 20
GC.start

and see how many instances are still hanging around at the end of your
script.

Regards,
Sean
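Instead of guessing a sleep long enough for every thread to finish, the script can join the threads and only then count. This is a minimal sketch of that variation; the `seconds` parameter, the `threads` array and the `live_count` helper are additions for illustration and are not part of the original script. No particular count is promised afterwards, since the collector is free to keep an object alive a little longer.

```ruby
class Foo
  attr_accessor :bar
end

# Hypothetical variant of make_thread with a configurable sleep.
def make_thread(seconds)
  Thread.new { sleep seconds }
end

# ObjectSpace.each_object returns the number of objects it visited.
def live_count(klass)
  ObjectSpace.each_object(klass) { }
end

threads = []
5.times do
  f = Foo.new
  f.bar = make_thread(0.3)
  threads << f.bar          # keep the threads, not the Foos
end

GC.start
puts "while threads run: #{live_count(Foo)} Foo instances"

threads.each(&:join)        # wait until every thread has terminated
GC.start
remaining = live_count(Foo)
puts "after join:        #{remaining} Foo instances"
```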

On Sep 17, 2008, at 8:49 AM, Brian C. wrote:

Thread.new { sleep 10000 }
end

and run it again.

The big arrays are happily garbage-collected; the Foos are not.

but that is expected? you have a Foo which refers to a Thread which
has not died that is itself referred to by the global Thread.list

cfp:~ > ruby -e' 41.times{ Thread.new{ sleep } }; p Thread.list.size '
42

so when the threads die, the Foos are freed

cfp:~ > cat a.rb
require 'yaml'

def make_thread
  Thread.new { sleep 2 }
end

class Foo
  attr_accessor :bar
end

n = Integer(ARGV.first || 1024)

3.times do
  info = {}

  n.times { |i|
    f = Foo.new
    f.bar = make_thread
  }

  info.update 'before' => ObjectSpace.each_object(Foo){}

  sleep 2.42
  GC.start

  info.update 'after' => ObjectSpace.each_object(Foo){}

  y info
end

cfp:~ > ruby a.rb

after: 2
before: 1024

after: 2
before: 1026

after: 2
before: 1026

a @ http://codeforpeople.com/
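The Thread.list observation above can be checked directly. This sketch uses only core Thread methods; the variable names are illustrative. A sleeping thread is listed by Thread.list, and once it has been joined it no longer is, so from then on the thread list cannot be what keeps it (or anything it holds) reachable.

```ruby
# Start a few threads that stay alive for a short while.
workers = Array.new(5) { Thread.new { sleep 0.5 } }

# While they sleep, each one is reachable through the global thread list.
listed_while_running = workers.count { |t| Thread.list.include?(t) }

# join returns once the thread has terminated.
workers.each(&:join)

# Thread.list only reports threads that are runnable or stopped, not dead ones.
listed_after_join = workers.count { |t| Thread.list.include?(t) }

puts "listed while running: #{listed_while_running}"
puts "listed after join:    #{listed_after_join}"
```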

Ara Howard wrote:

but that is expected? you have a Foo which refers to a Thread which
has not died that is itself referred to by the global Thread.list

cfp:~ > ruby -e' 41.times{ Thread.new{ sleep } }; p Thread.list.size '
42

Foo ------------>
                  thread
thread list ---->

I don’t think this should prevent garbage collection of Foo, if nothing
is holding a reference to Foo
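The direction of the arrows in that diagram can be inspected on a modern CRuby. The sketch below assumes Ruby 2.0 or later and the objspace extension, whose ObjectSpace.reachable_objects_from is a debugging aid that reports only the objects one object references directly. It says nothing about what the 1.8 interpreter keeps on its thread stacks, which is the open question in this thread. The names `worker`, `foo_sees_thread` and `thread_sees_foo` are illustrative.

```ruby
require 'objspace'

class Foo
  attr_accessor :bar
end

# As in the original script, the thread is created inside a method, so its
# block does not close over the caller's local variables.
def make_thread
  Thread.new { sleep 0.5 }
end

foo = Foo.new
worker = make_thread
foo.bar = worker

# Direct references held by the Foo instance (its class, its @bar, ...).
foo_sees_thread =
  ObjectSpace.reachable_objects_from(foo).any? { |o| o.equal?(worker) }

# Direct references held by the thread object.
thread_sees_foo =
  ObjectSpace.reachable_objects_from(worker).any? { |o| o.equal?(foo) }

puts "Foo    -> thread: #{foo_sees_thread}"
puts "thread -> Foo:    #{thread_sees_foo}"

worker.join
```

A reference from Foo to the thread does not by itself make Foo reachable; only a reference pointing at Foo, from a root or from another live object, does.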

2008/9/17 ara.t.howard:

to

def make_thread
Thread.new { sleep 10000 }
end

and run it again.

Good point!

but that is expected? you have a Foo which refers to a Thread which has
not died that is itself referred to by the global Thread.list

Yeah, but there is no reference back to the Foos so they could be
collected.

If you change the code to

def make_thread
  Thread.new { loop { sleep 2 } }
end

1000.times {
  Foo.new.bar = "x" * rand(100) # make_thread
}

Then Foo instances are quickly removed, although 1 is still referenced.
The only explanation I have ATM is that the thread stack might be
copied somehow and thus keep the ref alive. Interestingly enough, ruby19
behaves quite differently: the count goes down to 0 instead of 1. But
with the old version (i.e. Foo.new.bar = make_thread) I get thread
creation errors. With some changes the number actually goes down to
0:

17:43:31 Temp$ cat gc.rb

def make_thread
  Thread.new { loop { sleep 2 } }
  # Thread.new { loop { Array.new(1_000_000); sleep 0.1 } }
end

class Module
  def count
    c = 0
    ObjectSpace.each_object(self) { c += 1 }
    c
  end
end

Foo = Struct.new :bar

10.times {
  # Foo.new.bar = "x" * rand(100)
  Foo.new.bar = make_thread
}

puts "Threads created"

# use memory

Thread.new { loop { Array.new(1_000_000); sleep 0.1 } }

loop do
  puts Foo.count
  GC.start
  sleep 1
end

17:43:37 Temp$ ruby19 --version
ruby 1.9.0 (2008-03-01 revision 15664) [i386-cygwin]
17:43:48 Temp$ ruby19 gc.rb
Threads created
10
0
0
0
0
0
0
0
0
0
17:44:01 Temp$

Kind regards

robert

but threads do hold their context

Could you define “context” in this, erm, context? :)

I thought that ‘def’ started a new scope/context. If a thread is started
from within def, how could it know about objects which are only
referenced from outside that scope?

It’s as if the thread is holding on to bits of stack or bindings which
really belong to another thread.

On Sep 17, 2008, at 10:21 AM, Brian C. wrote:

thread list ---->

I don’t think this should prevent garbage collection of Foo, if
nothing
is holding a reference to Foo

but threads do hold their context

void
rb_thread_schedule(void)
{
    thread_debug("rb_thread_schedule\n");
    if (!rb_thread_alone()) {
        rb_thread_t *th = GET_THREAD();

        thread_debug("rb_thread_schedule/switch start\n");

        rb_gc_save_machine_context(th); // ← global state
        native_mutex_unlock(&th->vm->global_interpreter_lock);
        {
            native_thread_yield();
        }
        native_mutex_lock(&th->vm->global_interpreter_lock);

        rb_thread_set_current(th);
        thread_debug("rb_thread_schedule/switch done\n");

        RUBY_VM_CHECK_INTS();
    }
}

a @ http://codeforpeople.com/

On Sep 17, 2008, at 3:24 PM, Brian C. wrote:

Could you define “context” in this, erm, context? :)

I thought that ‘def’ started a new scope/context. If a thread is
started
from within def, how could it know about objects which are only
referenced from outside that scope?

nope - we’ll have to wait for matz! ;)

It’s as if the thread is holding on to bits of stack or bindings which
really belong to another thread.

yeah - it certainly looks exactly that way - push stack and all…

a @ http://codeforpeople.com/