I wondered if someone can explain the following behaviour to me.
For some reason, in the following code the instances of object Foo are
prevented from being garbage collected.
def make_thread
  Thread.new { sleep 2 }
end

class Foo
  attr_accessor :bar
end

10.times { |i|
  f = Foo.new
  f.bar = make_thread
}

GC.start
sleep 1
GC.start
ObjectSpace.each_object(Foo) { |o| p o }
But if I modify the code as follows, then the Foo objects are
garbage-collected just fine:
--- tst.rb 2008-09-17 10:00:27.000000000 +0100
+++ tst2.rb 2008-09-17 10:03:04.000000000 +0100
@@ -6,9 +6,10 @@
attr_accessor :bar
end
+ threads = (0...9).collect { make_thread }
10.times { |i|
f = Foo.new
GC.start
It’s as if the thread created by Thread.new keeps a reference to the Foo
object which existed at the time, even though inside make_thread it is
out of scope, so it shouldn’t even know about it.
By tweaking the ‘sleep’ values, it also seems that when the thread
terminates it then does permit the Foo instance to be garbage-collected.
Any ideas as to what’s going on? I get the same results on these two
versions of Ruby:
ruby 1.8.4 (2005-12-24) [i486-linux]
ruby 1.8.6 (2008-03-03 patchlevel 114) [i686-linux]
Thanks,
Brian.
2008/9/17 Brian C.:
> I wondered if someone can explain the following behaviour to me.
There is no guarantee that all collectible objects are indeed
collected during a GC run. GC is not really deterministic - at least
from the user’s point of view.
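A minimal sketch of how one might observe this on a current CRuby. The class name Probe and the helper live_count are made up for illustration and are not part of the original script. The count while references are held is exact; the count after GC.start is not guaranteed to reach zero.

```ruby
# Hypothetical probe class; not part of the original script.
class Probe; end

# Count live instances of a class by walking the heap.
def live_count(klass)
  ObjectSpace.each_object(klass).count
end

held = Array.new(1000) { Probe.new }
count_while_held = live_count(Probe)   # all 1000 are strongly referenced here

held = nil                             # drop the only references
GC.start
count_after_gc = live_count(Probe)     # anywhere from 0 up to 1000

puts "while held: #{count_while_held}, after GC.start: #{count_after_gc}"
```

The second number can legitimately differ between runs and interpreter versions, which is why no particular value is promised here.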
> For some reason, in the following code the instances of object Foo are
> prevented from being garbage collected.
No, they are not collected. This is something different!
Kind regards
robert
Robert K. wrote:
>> For some reason, in the following code the instances of object Foo are
>> prevented from being garbage collected.
> No, they are not collected. This is something different!
OK, please consider my question rephrased as “why are these objects not
collected?”
Let me increase the number of objects:
def make_thread
  Thread.new { sleep 10000 }
end

class Foo
  attr_accessor :bar
end

10000.times { |i|
  f = Foo.new
  f.bar = make_thread
}

GC.start
sleep 1
GC.start
count = 0
ObjectSpace.each_object(Foo) { |o| count += 1 }
puts "#{count} objects"
On my machine, this program uses about 150MB of RAM, and all 10,000
objects are remaining at the end of the run.
Have I done something here which is “wrong”? Are the known pitfalls of
the garbage collector documented anywhere, or ways to write programs so
as to avoid them? Or can it simply not be relied upon, ever?
Regards,
Brian.
Your version only works because the threads are dying after 2 seconds.
Change
def make_thread
  Thread.new { sleep 2 }
end
to
def make_thread
  Thread.new { sleep 10000 }
end
and run it again.
The big arrays are happily garbage-collected; the Foos are not.
2008/9/17 Brian C.:
> f.bar = make_thread
> On my machine, this program uses about 150MB of RAM, and all 10,000
> objects are remaining at the end of the run.
> Have I done something here which is “wrong”? Are the known pitfalls of
> the garbage collector documented anywhere, or ways to write programs so
> as to avoid them?
You probably just did not allocate enough new objects to make the GC
collect stuff. AFAIK there is also a minimal process size under which
no GC occurs.
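One way to check whether a collection has actually run is to watch GC.count, which reports how many collections have happened since the process started. This is a sketch for a current CRuby; the variable names and the amount of garbage allocated are arbitrary choices made for illustration.

```ruby
runs_before = GC.count            # collections since the process started
GC.start                          # explicitly request a full collection
runs_after_start = GC.count

# Allocate a lot of short-lived garbage; the interpreter may or may not
# decide to collect on its own while this runs.
200_000.times { "x" * 100 }
runs_after_alloc = GC.count

puts "before: #{runs_before}, after GC.start: #{runs_after_start}, " \
     "after allocating: #{runs_after_alloc}"
```

If the last two numbers are equal, the allocations did not create enough pressure to trigger a collection on their own.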
> Or can it simply not be relied upon, ever?
Yes, it can. Try this
def make_thread
  Thread.new { sleep 2 }
end

class Module
  def count
    c = 0
    ObjectSpace.each_object(self) { c += 1 }
    c
  end
end

Foo = Struct.new :bar

1000.times {
  Foo.new.bar = make_thread
}

# use memory
Thread.new { loop { Array.new(1_000_000); sleep 0.1 } }

loop do
  puts Foo.count
  GC.start
  sleep 1
end
Cheers
robert
On Wed, Sep 17, 2008 at 10:07 AM, Brian C. wrote:
> attr_accessor :bar
> object which existed at the time, even though inside make_thread it is
> Thanks,
> Brian.
The threads are not garbage collected until they terminate and so the
Foo instances are not being GC’d. You’re not sleeping long enough at
the end of your script. Try using something like:
GC.start
sleep 20
GC.start
and see how many instances are still hanging around at the end of your
script.
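The point about threads staying alive until they terminate can be seen directly in Thread.list. A small sketch, with the sleep shortened to half a second so it finishes quickly; the variable names are made up for illustration.

```ruby
# Start five short-lived threads and keep handles so we can join them.
threads = Array.new(5) { Thread.new { sleep 0.5 } }

live_before = Thread.list.size    # includes the five sleepers
threads.each(&:join)              # wait until every thread has terminated
live_after = Thread.list.size     # terminated threads are no longer listed

puts "listed before join: #{live_before}, after join: #{live_after}"
```

While a thread is listed it is reachable from the interpreter itself, so it cannot be collected regardless of what the script still references.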
Regards,
Sean
On Sep 17, 2008, at 8:49 AM, Brian C. wrote:
>   Thread.new { sleep 10000 }
> end
> and run it again.
> The big arrays are happily garbage-collected; the Foos are not.
> Posted via http://www.ruby-forum.com/.
but that is expected? you have a Foo which refers to a Thread which
has not died that is itself referred to by the global Thread.list
cfp:~ > ruby -e' 41.times{ Thread.new{ sleep } }; p Thread.list.size '
42
so when the threads die, the Foos are freed
cfp:~ > cat a.rb
require 'yaml'

def make_thread
  Thread.new { sleep 2 }
end

class Foo
  attr_accessor :bar
end

n = Integer(ARGV.first || 1024)

3.times do
  info = {}

  n.times { |i|
    f = Foo.new
    f.bar = make_thread
  }

  info.update 'before' => ObjectSpace.each_object(Foo){}
  sleep 2.42
  GC.start
  info.update 'after' => ObjectSpace.each_object(Foo){}
  y info
end
cfp:~ > ruby a.rb
after: 2
before: 1024
after: 2
before: 1026
after: 2
before: 1026
a @ http://codeforpeople.com/
Ara Howard wrote:
> but that is expected? you have a Foo which refers to a Thread which
> has not died that is itself referred to by the global Thread.list
> cfp:~ > ruby -e' 41.times{ Thread.new{ sleep } }; p Thread.list.size '
> 42
Foo ------------>
                  thread
thread list ---->
I don’t think this should prevent garbage collection of Foo, if nothing
is holding a reference to Foo
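The direction of the references in the diagram can be inspected on a current CRuby with the objspace extension. This is only a sketch: ObjectSpace.reachable_objects_from reports direct references as the present-day interpreter sees them, and says nothing about what 1.8 kept on its thread stacks. The variable names are made up for illustration.

```ruby
require 'objspace'

class Foo
  attr_accessor :bar
end

def make_thread
  Thread.new { sleep }
end

foo = Foo.new
thread = make_thread
foo.bar = thread

# Objects that foo points at directly (its class, its instance variables).
from_foo = ObjectSpace.reachable_objects_from(foo)
foo_points_at_thread = from_foo.any? { |o| o.equal?(thread) }

# Objects the thread points at directly. Whether foo ever shows up here
# depends on interpreter internals, so no result is promised.
from_thread = ObjectSpace.reachable_objects_from(thread)
thread_points_at_foo = from_thread.any? { |o| o.equal?(foo) }

puts "Foo -> thread: #{foo_points_at_thread}"
puts "thread -> Foo: #{thread_points_at_foo}"

thread.kill
thread.join
```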
2008/9/17 ara.t.howard:
> to
> def make_thread
>   Thread.new { sleep 10000 }
> end
> and run it again.
Good point!
> but that is expected? you have a Foo which refers to a Thread which has
> not died that is itself referred to by the global Thread.list
Yeah, but there is no reference back to the Foos so they could be
collected.
If you change the code to
def make_thread
  Thread.new { loop { sleep 2 } }
end
…
1000.times {
  Foo.new.bar = "x" * rand(100) # make_thread
}
Then Foo instances are quickly removed although 1 is still referenced.
The only explanation I have ATM is that the thread stack might be
copied somehow and thus keep the ref alive. Interestingly enough, ruby19
behaves quite differently: the count goes down to 0 instead of 1. But
with the old version (i.e. Foo.new.bar = make_thread) I get thread
creation errors. With some changes the number actually goes down to
0:
17:43:31 Temp$ cat gc.rb
def make_thread
  Thread.new { loop { sleep 2 } }
  Thread.new { loop { Array.new(1_000_000); sleep 0.1 } }
end

class Module
  def count
    c = 0
    ObjectSpace.each_object(self) { c += 1 }
    c
  end
end

Foo = Struct.new :bar

10.times {
  Foo.new.bar = "x" * rand(100)
  Foo.new.bar = make_thread
}

puts "Threads created"

# use memory
Thread.new { loop { Array.new(1_000_000); sleep 0.1 } }

loop do
  puts Foo.count
  GC.start
  sleep 1
end
17:43:37 Temp$ ruby19 --version
ruby 1.9.0 (2008-03-01 revision 15664) [i386-cygwin]
17:43:48 Temp$ ruby19 gc.rb
Threads created
10
0
0
0
0
0
0
0
0
0
17:44:01 Temp$
Kind regards
robert
> but threads do hold their context
Could you define “context” in this, erm, context?
I thought that ‘def’ started a new scope/context. If a thread is started
from within def, how could it know about objects which are only
referenced from outside that scope?
It’s as if the thread is holding on to bits of stack or bindings which
really belong to another thread.
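At the language level, the scoping intuition can be checked with Proc#binding. The sketch below only shows which local variables a block can see by name; it does not show what a 1.8 thread keeps on its machine stack, which is the open question in this thread. The method name make_block and the variable names are made up for illustration.

```ruby
def make_block
  proc { :noop }                  # created inside a method body
end

f = Object.new
inline_block = proc { :noop }     # created where `f` is in scope
method_block = make_block

# A block's binding exposes the locals of the scope it was created in.
inline_sees_f = inline_block.binding.local_variable_defined?(:f)
method_sees_f = method_block.binding.local_variable_defined?(:f)

puts "block created inline sees f:     #{inline_sees_f}"
puts "block created inside def sees f: #{method_sees_f}"
```

So a block created inside a def has no named access to the caller's locals, which is why any reference kept by the thread would have to come from below the language level.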
On Sep 17, 2008, at 10:21 AM, Brian C. wrote:
> thread list ---->
> I don’t think this should prevent garbage collection of Foo, if nothing
> is holding a reference to Foo
but threads do hold their context
void
rb_thread_schedule(void)
{
    thread_debug("rb_thread_schedule\n");
    if (!rb_thread_alone()) {
        rb_thread_t *th = GET_THREAD();

        thread_debug("rb_thread_schedule/switch start\n");

        rb_gc_save_machine_context(th); // ← global state
        native_mutex_unlock(&th->vm->global_interpreter_lock);
        {
            native_thread_yield();
        }
        native_mutex_lock(&th->vm->global_interpreter_lock);

        rb_thread_set_current(th);
        thread_debug("rb_thread_schedule/switch done\n");

        RUBY_VM_CHECK_INTS();
    }
}
a @ http://codeforpeople.com/
On Sep 17, 2008, at 3:24 PM, Brian C. wrote:
> Could you define “context” in this, erm, context?
> I thought that ‘def’ started a new scope/context. If a thread is started
> from within def, how could it know about objects which are only
> referenced from outside that scope?
nope - we’ll have to wait for matz!
> It’s as if the thread is holding on to bits of stack or bindings which
> really belong to another thread.
yeah - it certainly looks exactly that way - push stack and all…
a @ http://codeforpeople.com/