Forum: Ruby-core Garbage Collection Question

Posted by Asher Haig (asher)
on 2010-08-25 05:42
(Received via mailing list)
This question is no doubt a function of my own lack of understanding, 
but I think that asking it will at least help some other folks see 
what's going on with the internals during garbage collection.

It's best summarized by example code and summary of my understanding of 
the resulting output so far as marking an object is concerned.

The question in short: when an object goes out of scope and has no 
references that were left to it, how does it get collected? Conceptually 
this seems easy - the GC walks the heap to see if the contents are still 
valid pointers to the heap. Once the pointer is invalid, it makes total 
sense to me how the rest proceeds. But how does the pointer to the heap 
ever become invalid?

Hopefully my example and summary will clarify:

require 'pp'

class Hash::Weak < Hash

  def []( key )

    # get the stored ID - a Fixnum, not an object reference to our
    # weak-referenced object
    obj_id = super( key.to_sym )

    # theoretically this should cause non-referenced objects to get
    # cleaned up
    # so long as nothing looks like a pointer or reference to it
    ObjectSpace.garbage_collect

    # now get our object from ID
    # if it had no references it should have been GC'd and we should get an
    # rb_eRangeError "is not id value" (expected) or "is recycled
    # object" (possible)
    obj = ObjectSpace._id2ref( obj_id )

    return obj

  end

  def []=( key, object )

    # A Fixnum's ID is fixed by its value, so it can't be copied and can't
    # be garbage collected; object.__id__ therefore cannot be a reference
    # to a child of object and cannot prevent garbage collection on the object
    super( key.to_sym, object.__id__ )

  end

end

##################################################

weak_hash = Hash::Weak.new

class TestClass
end
test_object  =  TestClass.new

puts 'storing test object'
weak_hash[ :key ] = test_object
puts 'hash now contains object id: ' + weak_hash.pretty_inspect

print 'retrieving stored test object from hash (should work/be non-nil): '
valid_key = weak_hash[ :key ]
pp valid_key

class AnotherClass
end

puts 'setting variable referring to test object (ID: ' + 
test_object.__id__.to_s + ') to nil'
test_object = nil
puts 'ID for variable referring to test object is now: ' + 
test_object.__id__.to_s

print 'getting test object (should fail with rb_eRangeError): '
invalid_key = weak_hash[ :key ]
pp invalid_key
# error - returns valid key

##################################################

# object is created, given an id
# variable is assigned to id
# variable is changed to new object (including nil)
# variable gets the id of new object
# previous reference made by variable remains in object space (no valid references)
# gc starts
# rb_gc_mark calls gc_mark, marks VM instance (#<RubyVM:0x000001008700b8>)
# gc_mark calls gc_mark_children, marks all children of VM
# first, the VM class (RubyVM), and then its children
#   then its class instance (#<Class:RubyVM>), and then its children
#     then its class instance (#<Class:#<Class:RubyVM>>) and its children
#       then its class instance (#<Class:#<Class:Class>>) and its children
#         then gc_mark_children calls mark_tbl to mark its table (the class table)
#           mark_tbl marks all children of the class table, starting with Class
#             Class marks its children, first of all #<Class:Module>
#               #<Class:Module> marks its children, first of all #<Class:#<Class:Module>>
#                 #<Class:#<Class:Module>> marks its children, which includes a table of classes
#                   mark_tbl marks each of classes, which first includes #<Class:Object>
#                     #<Class:Object> has a table of entries that it marks, first of all Object
#                       Object has a table that it marks, first of all its binding context (presumably main first?) #<Binding:0x00000100870068>
#                         #<Binding:0x00000100870068> marks its children, which calls binding_mark, which calls rb_gc_mark, which calls gc_mark on the Ruby environment: #<RubyVM::Env:0x00000100854bd8>
#                           #<RubyVM::Env:0x00000100854bd8> marks its children which calls env_mark
#                             env_mark calls rb_gc_mark_locations on the range covered by the environment's declared memory space, which calls gc_mark_locations
#                               gc_mark_locations calls mark_locations_array on the space marked by the start and length of environment
#                                 mark_locations_array looks at the environment as an array of long, and calls is_pointer_to_heap on each one
#                                   if (long)slice is the address of a valid pointer on the heap, returns TRUE, which causes gc_mark to be called on the object
#*****                                object, defined by ID, matches with (long)slice because it has not yet been collected; it is therefore marked as still existing because it has a valid pointer
# =>                                  if this were true, no object would ever be garbage collected; so how is any object ever garbage collected?

Any help understanding what's going on is much appreciated.

Thanks,
Asher
Posted by Roger Pack (Guest)
on 2010-08-26 18:35
(Received via mailing list)
> The question in short: when an object goes out of scope and has no
> references that were left to it, how does it get collected? Conceptually
> this seems easy - the GC walks the heap to see if the contents are still
> valid pointers to the heap. Once the pointer is invalid, it makes total
> sense to me how the rest proceeds. But how does the pointer to the heap ever
> become invalid?


It walks the stack, and (for sake of ease of understanding) marks all
pointers still on the stack as "live"
then it marks all of their children as "live"
then all of their grandchildren, etc.

Then it traverses the entire heap, looking for objects that haven't
been marked as live, and "frees" them.

NB that these two stages are separate, so they don't conflict.
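
A toy model of those two stages, purely as a sketch: the "heap" here is a hypothetical Hash of names to the names they reference, and `mark` / `sweep` are illustrative helpers, not MRI's actual functions.

```ruby
# Toy heap: each "object" is a name mapped to the names it references.
heap = {
  a: [:b],
  b: [:c],
  c: [],
  d: [:e],   # d references e, but nothing reachable references d
  e: []
}
roots = [:a]  # stands in for "pointers still on the stack"

# Mark phase: follow references from the roots and record everything reached.
def mark(heap, roots)
  marked = {}
  pending = roots.dup
  until pending.empty?
    name = pending.pop
    next if marked[name] || !heap.key?(name)
    marked[name] = true
    pending.concat(heap[name])
  end
  marked
end

# Sweep phase: walk the whole heap and free whatever was not marked.
def sweep(heap, marked)
  heap.keys.reject { |name| marked[name] }.each { |name| heap.delete(name) }
end

marked = mark(heap, roots)
freed = sweep(heap, marked)
p freed      # the unreachable names
p heap.keys  # the survivors
```

Nothing is freed during marking, and nothing is marked during sweeping, which is why the two stages do not interfere.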
HTH.
-r
Posted by Asher Haig (asher)
on 2010-08-26 18:51
(Received via mailing list)
Right - so how does a pointer ever get off the stack?

For instance, in my example, where the variable with reference to the 
object has been assigned nil - the same thing occurs if the variable 
goes out of scope.

So in both of those cases, the object "should" be garbage collected; I 
understand that it's possible, due to conservative GC, that it might 
mistake a number on the stack (a long), etc. as a valid pointer, but 
generally when GC runs it should decide that the var (which has no valid 
ruby references) is no longer live and should be GC'd. Or am I missing 
something?

So we have a var with no references in Ruby that is being marked as live 
by the GC because the pointer has not yet been deallocated. So how does 
it ever get deallocated in order to not be marked as live?

If what I am seeing is the case (and I assume it cannot be and that I am 
missing something) then the object would never be garbage collected.

So how does GC actually occur? What causes the pointer to be 
deallocated?

Asher
Posted by Roger Pack (Guest)
on 2010-08-26 20:36
(Received via mailing list)
> Right - so how does a pointer ever get off the stack?
>
> For instance, in my example, where the variable with reference to the object has been assigned nil - the same thing occurs if the variable goes out of scope.
>
> So in both of those cases, the object "should" be garbage collected; I understand that it's possible, due to conservative GC, that it might mistake a number on the stack (a long), etc. as a valid pointer, but generally when GC runs it should decide that the var (which has no valid ruby references) is no longer live and should be GC'd. Or am I missing something?

That's right.

> So we have a var with no references in Ruby that is being marked as live by the GC because the pointer has not yet been deallocated. So how does it ever get deallocated in order to not be marked as live?

Presumably it is "not being collected" because of a false positive on 
the stack.
So if you go "up and down" long enough on the stack, it will overwrite
the false positive eventually (it's hoped), and thus clear the false
positive.
Posted by unknown (Guest)
on 2010-08-26 21:35
(Received via mailing list)
On Thu, Aug 26, 2010 at 2:36 PM, Roger Pack <rogerdpack2@gmail.com> 
wrote:
> Presumably it is "not being collected" because of a false positive on the stack.
> So if you go "up and down" long enough on the stack, it will overwrite
> the false positive eventually (it's hoped), and thus clear the false
> positive.

https://sites.google.com/site/brentsrubypatches/
MBARI3.patch:
Ruby's conservative garbage collector cannot tell whether machine
words on the 'C' stack are object pointers or integers, etc. because
there is no type information associated with them.  A conservative
collector works by "conserving" every object to which there could
possibly be a reference.  In the 1.8 and 1.6 series Ruby
implementations, this means scanning the stack of each Thread and
Continuation assuming that every word is an object pointer if it has a
value that could be so interpreted.  In practice, this is not as bad as it
may seem, as Ruby's collector does not consider pointers "inside" an
object to be valid -- only those that point to its exact base address.
 So, even assuming thousands of live objects, a 32-bit address space
will remain very sparsely populated with valid object pointers.
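
The "exact base address" test can be sketched roughly as follows. This is a simplified model, not MRI's is_pointer_to_heap: the constants `SLOT_SIZE`, `HEAP_START` and `HEAP_SLOTS` and the helper `pointer_to_heap?` are made up for illustration, and a single contiguous heap is assumed.

```ruby
SLOT_SIZE  = 40       # hypothetical size of one object slot, in bytes
HEAP_START = 0x1000   # hypothetical address of the first slot
HEAP_SLOTS = 8        # hypothetical number of slots in the heap

# A word counts as an object pointer only if it lands inside the heap
# AND exactly on the start of a slot. Interior addresses are rejected.
def pointer_to_heap?(word)
  return false unless word.is_a?(Integer)
  offset = word - HEAP_START
  offset >= 0 && offset < SLOT_SIZE * HEAP_SLOTS && (offset % SLOT_SIZE).zero?
end

stack_words = [
  42,                               # a small integer, below the heap
  HEAP_START + 2 * SLOT_SIZE,       # exact base address of slot 2
  HEAP_START + 2 * SLOT_SIZE + 8,   # points "inside" slot 2
  0xdeadbeef                        # far outside the heap
]

candidates = stack_words.select { |word| pointer_to_heap?(word) }
p candidates
```

Only the exact slot address survives the filter, which is why random integers rarely collide with live objects.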

The garbage collector's leaking memory is not really its own fault.
The trouble is that the 'C' machine stack is filled with object
references.  The main reason for this is that gcc compilers create
overly large stack frames and do not initialize many values in them.
Certain 'C' constructs used in the Ruby interpreter's core recursive
expression evaluator generate especially large, sparse stack frames.
The function rb_eval() is the worst offender, creating kilobyte sized
stack frames for each invocation of a function that may call itself
hundreds of times.  This results in stacks that are hundreds of
kilobytes, often full of old, dead object references that may never go
away.  If there were a gcc compiler option to initialize all local
variables to zero whenever a new stack frame is built, that would let
the collector do its work properly, but no such option exists.
Posted by Kurt Stephens (Guest)
on 2010-08-27 04:43
(Received via mailing list)
On 8/26/10 11:51 AM, Asher wrote:
> Right - so how does a pointer ever get off the stack?
>
When a C function returns, the C stack pointer register (usually called
"SP") is reset to the frame pointer (sometimes this register is called
"FP").  The FP points to the current function arguments.  The area
between the SP and the FP +- the space for arguments (and the other
machine registers) represent the local variables, temporaries and
arguments of the current function call (sometimes called an "activation
record").

Load any C program under a debugger and you can see the assembly code.

The MRI GC knows where the "top" (SP) and the bottom of the stack are because
of mostly portable conventions on how C compilers generate code that
manipulate SP and FP and how the operating system lays out the process'
memory.  The stack, the machine registers and some global variables are
part of what is sometimes called the "root set".

The MRI GC scans the root set for values that "look like they point to
Ruby objects" and "marks" those objects recursively as "in use".  Any
unmarked objects ("not in use") are definitely not referenced by
anything else and can be deallocated ("swept").  The GC must
"stop-the-world" while it does this "marking" and "sweeping" -- nothing
else can happen till this finishes.   If the GC couldn't sweep anything,
it allocates more memory from the OS (by calling malloc(), which calls
something at a much lower level: sbrk() or mmap() or something else).

> For instance, in my example, where the variable with reference to the object has been assigned nil - the same thing occurs if the variable goes out of scope.
>
> So in both of those cases, the object "should" be garbage collected; I understand that it's possible, due to conservative GC, that it might mistake a number on the stack (a long), etc. as a valid pointer, but generally when GC runs it should decide that the var (which has no valid ruby references) is no longer live and should be GC'd. Or am I missing something?
>
> So we have a var with no references in Ruby that is being marked as live by the GC because the pointer has not yet been deallocated. So how does it ever get deallocated in order to not be marked as live?
>
> If what I am seeing is the case (and I assume it cannot be and that I am missing something) then the object would never be garbage collected.
>
> So how does GC actually occur?

Collection occurs in MRI when a new object is needed and there are no
unused objects left around and/or there was a certain number of
allocations since the last GC.
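
One way to watch collections happen, as a small sketch assuming Ruby 1.9 or later (GC.count is not available in 1.8):

```ruby
before = GC.count   # number of collections so far in this process
GC.start            # request a full collection right now
after = GC.count

puts "GC runs before: #{before}, after: #{after}"
```

Allocating a large number of objects in a loop and printing GC.count periodically shows the allocation-driven collections as well, though exactly when they fire depends on the Ruby version and its tuning.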

> What causes the pointer to be deallocated?
>
"Pointers" are never allocated or deallocated as in malloc()/free().
Only objects that have no references to them are deallocated.

The C compiler generates code that simply increments or decrements the
SP or changes the FP -- Stacks are LIFOs.

The MRI GC is a very simple "stop-the-world", "mark-and-sweep"
"conservative" collector.  Conservative meaning "treat anything that
looks like a pointer to an object as a pointer to an object".  This can
cause conservative collectors to keep some objects around longer than
they should.  This is also because most C compilers leave garbage (old
pointers) on the stack.
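
A toy illustration of how an old pointer ends up inside a live frame. This is only a model: the "stack" is a plain Ruby Array, `sp` is a slot count, and 0x1050 is a made-up object address.

```ruby
stack = Array.new(6, 0)  # pretend machine stack, all slots zero at first
sp = 0                   # number of slots currently in use

# "Call" a function with a 3-slot frame that stores an object address.
sp += 3
stack[sp - 1] = 0x1050   # local variable holding a (pretend) object pointer

# "Return": only sp moves back. The memory itself is left untouched.
sp -= 3

# "Call" another function with a 4-slot frame that initialises one slot.
sp += 4
stack[sp - 4] = 7

# A conservative scan looks at every slot in the in-use region.
live_region = stack[0...sp]
p live_region
```

The second frame never wrote the address, yet the scan still finds it, so the object it names would be kept alive.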

The Rubinius GC is different.  Ruby Enterprise Edition (REE) uses
additional techniques on top of the standard MRI GC to improve
performance in web servers and long-running processes.


>> been marked as live, and "frees" them.
>>
>> NB that these two stages are separate, so they don't conflict.
>> HTH.
>> -r
>
>

Yea, what Roger said.  :)

More here:

http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29

A widely-ported and long-used GC can be downloaded here:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/

ACM has a long rich history of GC research -- there's even a yearly
symposium on the subject.  Appel's book is a great reference.
This book (sadly out of print) is a good introduction:

http://www.amazon.com/Topics-Advanced-Language-Implementation-Peter/dp/0262121514

There are far more complex GC algorithms that perform better in most
cases.  Conservative mark-and-sweep collectors are far easier to
interface with C code than other approaches -- most others require
considerable cooperation between the code and the collector.

HTH^2,
-- KAS
Posted by Asher Haig (asher)
on 2010-08-27 17:23
(Received via mailing list)
I very much appreciate the response, and this is helpful in describing 
the narrative, but it's still a few steps behind my question - but it 
may very well have clarified some points that help us get there.

Let's stick with the example: a local variable is set as a reference to 
an object, the local variable is then set to nil so there is no longer a 
live reference to the object. No other ruby space commands have gone on, 
so unless Ruby is keeping junk behind the scenes, there should be no 
references - not on the Ruby stack, not on the C stack. How does this 
object get collected? As shown by the example, it is missed during the 
next attempt to GC (as well as any repeated attempts at this point).

So what will change to make that object collectable? Are you suggesting 
that because it is in Ruby's root node that it gets treated such that it 
won't be GC'd until the program terminates?

Let's assume this is the case. I should therefore be able to write a 
script that creates a non-root object as a child to another object 
inside method scope, allow that method to go out of scope, and expect 
that the object will be GC'd (as it is neither a root node nor does it 
have any live references).

This seems to be validated by the following Ruby code:

>     # so long as nothing looks like a pointer or reference to it
>   
> 
>     $weak_hash[ :key ] = child_test_object
>     puts 'hash now contains object id: ' + $weak_hash.pretty_inspect    
>   end
> end
> test_object  =  TestClass.new
> 
> test_object.test_method
> puts 'id in hash should no longer be valid, as it is out of scope: '
> invalid_key = $weak_hash[ :key ]
> pp invalid_key

Output:

> storing test object
> hash now contains object id: {:key=>2160173880}
> id in hash should no longer be valid, as it is out of scope:
> /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:21:in `_id2ref': 0x00000080c1a338 is recycled object (RangeError)  
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:21:in `[]'
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:56:in `<main>'

So that works as expected (so long as GC is run manually; if not run, 
object is obviously still valid).

So let's try the same thing in this example that we did with the root 
node example:

>     $weak_hash[ :key ] = child_test_object
> end
> test_object  =  TestClass.new
> 
> test_object.test_method
> puts 'id in hash should no longer be valid, as it is out of scope: '
> invalid_key = $weak_hash[ :key ]
> pp invalid_key

Output:

>   
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:62:in `<main>'

So that also works as expected.

It seems the only time that it does not work as expected, then, is when 
the object was instantiated with a reference in the root node.

Of course, that is one of the most likely places for people to 
instantiate a reference. So can anything be done about this? Are we 
simply doomed to wait until program termination for any objects 
allocated in the root node to disappear?

I intend to look into the patch suggested by brabuhr@gmail.com 
(https://sites.google.com/site/brentsrubypatches/), which (so far as 
this issue is concerned) appears to amount to:

> VALUE *rb_gc_stack_end = (VALUE *)STACK_GROW_DIRECTION;
> #define rb_gc_wipe_stack() {       \
>   VALUE *sp = alloca(0);             \
>   VALUE *end = rb_gc_stack_end;  \
>   rb_gc_stack_end = sp;              \
>   __stack_zero(end, sp);         \

And some other basic support. I will follow up on that once I have some 
time to experiment (particularly since the patch is intended for 1.8.7 
not 1.9.2). Any particular thoughts on this approach? Presumably there 
is some reason it has not been patched to do so?

In any case, any thoughts on any of this will be much appreciated.

Asher

On Aug 26, 2010, at 10:43 PM, Kurt Stephens wrote:

> 
> Collection occurs in MRI when a new object is needed and there are no unused objects left around and/or there was a certain number of allocations since the last GC.
> 
>> What causes the pointer to be deallocated?
>> 
> "Pointers" are never allocated or deallocated as in malloc()/free(). Only objects that have no references to them are deallocated.
> 
> The C compiler generates code that simply increments or decrements the SP or changes the FP -- Stacks are LIFOs.
> 
> The MRI GC is a very simple "stop-the-world", "mark-and-sweep" "conservative" collector.  Conservative meaning "treat anything that looks like a pointer to an object as a pointer to an object".  This can cause conservative collectors to keep some objects around longer than they should.  This is also because most C compilers leave garbage (old pointers) on the stack.

<snip>
Posted by Evan Phoenix (Guest)
on 2010-08-27 18:16
(Received via mailing list)
You have introduced something called a "root node" without defining it. 
What do you mean by this?

I'm assuming here you mean that in your case, if you allocate the object 
in the script body, then set the local to nil, you can observe that the 
object appears to not be collected. As has been stated in the thread 
already, this is an artifact of the conservative GC. Even though you 
have set the local to nil, a reference to the object may still remain on 
the C stack. That reference can't be seen by ruby code because it is in 
stack memory that gcc setup and didn't clear when the value wasn't 
needed anymore.

This is unfortunate, but not the end of the world. It doesn't happen 
with every object allocated in a script body, only sometimes. The patch 
set you were pointed to goes to lengths to clear the stack space as much 
as it can so that there are none of these phantom references to confuse 
the GC. It does this by breaking up the main eval function into smaller 
functions (allowing stack space to be allocated and deallocated within 
the eval itself) and forcibly clearing the stack with memset.

I hope this clears it up.

 - Evan

On Aug 27, 2010, at 8:22 AM, Asher wrote:

>> require 'pp'
>>     ObjectSpace.garbage_collect
>>   def []=( key, object )
>> ##################################################
>>     puts 'hash now contains object id: ' + $weak_hash.pretty_inspect    
> 
> 
>>     $weak_hash[ :key ] = child_test_object
>> end
>> hash now contains object id: {:key=>2160328280}
> 
>>   VALUE *sp = alloca(0);             \
>>   VALUE *end = rb_gc_stack_end;  \
>>   rb_gc_stack_end = sp;              \
>>   __stack_zero(end, sp);         \
> 
> And some other basic support. I will follow up on that once I have some time to experiment (particularly since the patch is intended for 1.8.7 not 1.9.2). Any particular thoughts on this approach? Presumably there is some reason it has not been patched to do so? 
> 
> In any case, any thoughts on any of this will be much appreciated.
> 
> Asher

<snip>
Posted by Asher Haig (asher)
on 2010-08-27 18:27
(Received via mailing list)
On Aug 27, 2010, at 11:22 AM, Asher wrote:

> I intend to look into the patch suggested by brabuhr@gmail.com (https://sites.google.com/site/brentsrubypatches/), which (so far as this issue is concerned) appears to amount to:
> 
>> VALUE *rb_gc_stack_end = (VALUE *)STACK_GROW_DIRECTION;
>> #define rb_gc_wipe_stack() {       \
>>   VALUE *sp = alloca(0);             \
>>   VALUE *end = rb_gc_stack_end;  \
>>   rb_gc_stack_end = sp;              \
>>   __stack_zero(end, sp);         \
> 
> And some other basic support. I will follow up on that once I have some time to experiment (particularly since the patch is intended for 1.8.7 not 1.9.2). Any particular thoughts on this approach? Presumably there is some reason it has not been patched to do so? 

So as I understand it the problem is:

The basic Ruby stack looks like:

Ruby stack, root node FP*
  ruby root node locals => st_ivar_tbl
  ruby root node stack SP* (ruby stack frame 1 after the activation 
record)

So when a function call is made the stack grows to look like:

  ruby root node locals => st_ivar_tbl
  ruby root node stack SP* (ruby stack frame 1 after the activation 
record)
    ruby root first child node locals => st_ivar_tbl
    ruby root first child node CP*

So when the first child node finishes the CP* moves back to the SP and 
st_ivar_tbl is no longer part of the stack, which is why nested local 
variables get GC'd as expected.

But when the local variable in the root node is set to nil, the local 
var data for object ID in st_ivar_tbl is set to 4 instead of object ID. 
This leaves a valid pointer object ID with no references.

But "where" is this object ID pointer if its reference in the 
st_ivar_tbl is now replaced with Qnil? I presume the explanation for 
this is that the object actually lives on the heap in ObjectSpace 
rather than in local variable space, which means that the object is 
allocated and a reference is given to st_ivar_tbl, so when 
st_ivar_tbl's reference is gone there is still a valid reference in 
ObjectSpace (the heap).

So it seems that the root node's object is remaining around even though 
there are no references because its frame has not been cleared. Is this 
understanding correct?

So if the reference to the object is always in the heap, how does the 
heap's pointer become invalidated when st_ivar_tbl is cleared, as in the 
examples where it works "as expected"?

Perhaps there is something fundamental about local variable I am missing 
in my description here? I am trying to work through these things, so 
help is appreciated.

Thanks for patience,
Asher
Posted by Asher Haig (asher)
on 2010-08-27 18:33
(Received via mailing list)
On Aug 27, 2010, at 12:09 PM, Evan Phoenix wrote:

> You have introduced something called a "root node" without defining it. What do you mean by this?

The first node that runs when you run a script (ie. call ruby_run_node 
), which also defines the set of root references.

> I'm assuming here you mean that in your case, if you allocate the object in the script body, then set the local to nil, you can observe that the object appears to not be collected.

What you can see with my examples, though, is that it does happen with 
_all_ objects allocated on the root node

> As has been stated in the thread already, this is an artifact of the conservative GC. Even though you have set the local to nil, a reference to the object may still remain on the C stack. That reference can't be seen by ruby code because it is in stack memory that gcc setup and didn't clear when the value wasn't needed anymore.

Right - I understand this conceptually. I want to know "where" on the C 
stack this "might" remain. It shouldn't be an obtuse question - Ruby is 
allocating each and every object, and I'm not using any C pointers for 
the particular example, so there is nothing else in my C stack (in this 
case, "I" don't have a C stack, only Ruby does).

So Ruby is holding a reference somewhere in its stack, possibly because 
of

> This is unfortunate, but not the end of the world.

In my particular use case (not the example), it is the end of the world 
and requires re-designing the entire way I'm handling T_DATA, such that 
I pass back a new T_DATA every time an existing underlying C object is 
requested. I want to store the first T_DATA created for this object in a 
weak hash and pass it back as requested - allowing it to be collected as 
appropriate. This seems to work in all contexts but the root node, where 
the result is that one expects to get a GC'd object (which can thus be 
caught and returned as nil) but ends up with a valid obj (which 
shouldn't be valid).

The result is that one can ask for an object that doesn't exist, and 
instead of being told that it doesn't exist get back an old object that 
wasn't what one wanted (one wanted to know that it did not exist in this 
context, not get whatever random last object was created in the slot).

This example also, I believe, makes it evident that "root node" is not 
necessarily the actual root but can also be any root relative to 
execution context. In other words, a variable _will not_ be GC'd until 
one has left the frame in which it was defined, even if all references 
are set nil.

Example:

  it "can be created with a name string and home directory string" do
    @environment = RPDB::Environment.new( $environment_name.to_s, 
$environment_path )
    @environment.should_not == nil
    @environment.is_a?( RPDB::Environment ).should == true
    @environment.directory.should == $environment_path
  end

  it "can be created with a name symbol" do
    environment = RPDB.environment_with_name( $environment_name )
    environment.should == nil
    @environment = RPDB::Environment.new( $environment_name )
    @environment.should_not == nil
    @environment.is_a?( RPDB::Environment ).should == true
    @environment.directory.should == './'
  end

The last line of the second example does not end up with the default 
path ('./') because an existing reference is found when it should not 
be.

It seems, thus, that writing a weak hash is impossible given the current 
state of GC. This seems rather problematic.
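
For comparison, a minimal sketch of a weak-valued hash built on the standard library's WeakRef rather than on _id2ref. The class name `WeakValueHash` is hypothetical, and this does not make collection any more predictable; it only means a collected entry comes back as nil instead of as some unrelated object that reused the slot.

```ruby
require 'weakref'

# Hypothetical WeakValueHash: values are held through WeakRef, so the
# hash itself never keeps them alive.
class WeakValueHash
  def initialize
    @refs = {}
  end

  def []=(key, object)
    @refs[key] = WeakRef.new(object)
  end

  # Returns the stored object, or nil if the key is absent or the object
  # has already been collected.
  def [](key)
    return nil unless @refs.key?(key)
    @refs[key].__getobj__
  rescue WeakRef::RefError
    @refs.delete(key)  # drop the dead entry
    nil
  end
end

cache = WeakValueHash.new
kept = Object.new       # a strong reference held by this script
cache[:kept] = kept

p cache[:kept].equal?(kept)
p cache[:missing]
```

Later Ruby versions also ship ObjectSpace::WeakMap, which may be a better fit depending on the version in use.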

> It doesn't happen with every object allocated in a script body, only sometimes.

No, it happens _every_ time. See examples.

> The patch set you were pointed to goes to lengths to clear the stack space as much as it can so that there are none of these phantom references to confuse the GC. It does this by breaking up the main eval function into smaller functions (allowing stack space to be allocated and deallocated within the eval itself) and forcibly clearing the stack with memset.

Right... and I was trying to look where that would be appropriately 
integrated into 1.9.2, but my attempts have not been successful. I 
believe that this is an indication that that is not the problem in 
question here- that the problem has to do with the clearing of the 
present stack, rather than the clearing of stack frames that have been 
passed.

In other words, the patch clears old stack frames, but the problem here 
is that we have data remaining in the present stack frame that is not 
expected to still exist.

This is obviously a function of the GC's conservative nature, but I am 
trying to figure out what my best option is for circumventing the 
unexpected behavior.

Additionally, On Aug 27, 2010, at 12:13 PM, Roger Pack wrote:

> Unfortunately you'll have to assume that there is still some "bad ref"
> around to it.
> One trick is to try and nest whatever you "violently" need to be
> collected deep in some sub routine, then call GC.start *after*
> recursing back up from that sub routine.
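
A rough sketch of that trick, with hypothetical helper names (`make_and_drop`, `deep`). The outcome is deliberately not asserted: with a conservative collector the object may or may not be gone after GC.start.

```ruby
require 'weakref'

# Create the object in its own frame and hand back only a weak reference.
def make_and_drop
  obj = Object.new
  WeakRef.new(obj)
end

# Recurse a few dozen frames down before running the block, so that any
# stale copies of the pointer sit in stack space that is abandoned on return.
def deep(levels, &block)
  levels.zero? ? block.call : deep(levels - 1, &block)
end

ref = deep(50) { make_and_drop }

# Collect only after unwinding back up to this frame.
GC.start

puts ref.weakref_alive? ? 'still alive (something still looks like a reference)' : 'collected'
```

The depth of 50 is arbitrary; the point is only that the allocation happens well below the frame that calls GC.start.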


It does seem to be the answer that things are leaning toward, but I want 
to at least understand at a lower level precisely what is occurring to 
prevent this specific collection. It seems (based on my description of 
when it occurs) to be systemic rather than sporadic, so it should be 
possible to at least narrow it down to a specific place in code where a 
reference is being left, even if it is not so easy to adapt that code to 
do otherwise.

Best,
Asher
Posted by unknown (Guest)
on 2010-08-27 19:44
(Received via mailing list)
On Fri, Aug 27, 2010 at 12:33 PM, Asher <asher@ridiculouspower.com> 
wrote:
> On Aug 27, 2010, at 12:09 PM, Evan Phoenix wrote:
>> It doesn't happen with every object allocated in a script body, only sometimes.
>
> No, it happens _every_ time. See examples.

Modified original program:

##################################################

weak_hash = Hash::Weak.new

class TestClass
end

1_000.times do |n|
  test_object = TestClass.new

  weak_hash[ n ] = test_object
  valid_key = weak_hash[ n ]
  p valid_key
end

class AnotherClass
end

test_object = nil

1_000.times do |n|
  print "getting test object (#{n}) (should fail with rb_eRangeError): "
  invalid_key = weak_hash[ n ]
  p invalid_key
  # error - returns valid key
end

##################################################

Output:

$ ruby -v gc.rb
ruby 1.8.7 (2010-01-10 patchlevel 249) [i486-linux]
#<TestClass:0xb73f6d64>
#<TestClass:0xb73f6d3c>
#<TestClass:0xb73f7e80>
[...]
#<TestClass:0xb73f7e30>
#<TestClass:0xb73f7e08>
#<TestClass:0xb73f7e80>
getting test object (0) (should fail with rb_eRangeError):
#<TestClass:0xb73f7e80>
getting test object (1) (should fail with rb_eRangeError):
#<TestClass:0xb73f7e80>
getting test object (2) (should fail with rb_eRangeError):
#<TestClass:0xb73f7e80>
[...]
getting test object (31) (should fail with rb_eRangeError):
#<TestClass:0xb73f7e80>
getting test object (32) (should fail with rb_eRangeError):
#<TestClass:0xb73f7e80>
gc.rb:15:in `_id2ref': 0xdb9fbf04 is recycled object (RangeError)
        from gc.rb:15:in `[]'
        from gc.rb:54
        from gc.rb:52:in `times'
        from gc.rb:52
getting test object (33) (should fail with rb_eRangeError):
Posted by Asher Haig (asher)
on 2010-08-27 20:05
(Received via mailing list)
Trying your example 1000 times simply loops infinitely on my ruby. Not 
sure why.

Trying it 10.times and then 2.times works - throws expected error on 
first attempt to retrieve.

Trying it 1.times does not - returns uncollected object.

Your example of 1000 times causes it to work on the 33rd try.

What gives? Or is your point that there is no way to predict this? If 
that's the case, why does it work the first time consistently with my 
attempts 2.times, 10.times, etc.?

Asher
Posted by Evan Phoenix (Guest)
on 2010-08-27 20:05
(Received via mailing list)
My knowledge about the insides of 1.9 is less strong than 1.8, so I'm 
not fully versed in how 1.9 now stores locals.

Anyway, one issue with your testing methodology is that you don't define 
when the GC will happen. If you do some work in a method and return from it, 
even though there are no references to an object, if the GC hasn't run 
yet, then _id2ref will be able to return it. If you demand that 
returning from a local scope will cause the object to be treated as 
garbage, then you can't blindly use _id2ref. Ruby, and just about all GC'd 
languages, don't work that way.

Your examples do not include calling GC.start to force a GC, so I 
wonder if this is the source of your problem. Remember that by default, 
the GC runs whenever it wants, so you can't depend on it to run at 
certain times.

 - Evan
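
A minimal sketch of the distinction drawn above, between a collection the runtime decides to run and one that is explicitly requested. It assumes an MRI where GC.count is available (Ruby 1.9 or later, as far as I know). The figure of 100_000 allocations is an arbitrary value picked for illustration.

```ruby
# Number of collections the runtime has performed so far.
runs_at_start = GC.count

# Allocation pressure: the runtime may or may not decide to collect here.
100_000.times { Object.new }
runs_after_allocating = GC.count

# An explicit request: GC.start asks for a collection right now.
GC.start
runs_after_forcing = GC.count

puts "collections during allocation: #{runs_after_allocating - runs_at_start}"
puts "collections after GC.start:    #{runs_after_forcing - runs_after_allocating}"
```

Neither number says whether one particular object was reclaimed. A forced run only guarantees that a collection happened, not that a given unreferenced object was found to be garbage during it.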
Posted by Asher Haig (asher)
on 2010-08-27 20:08
(Received via mailing list)
It takes place in the Hash::Weak code in []:

class Hash::Weak < Hash

  def []( key )

    # get the stored ID - a Fixnum, not an object reference to our
    # weak-referenced object
    obj_id = super( key )

    # theoretically this should cause non-referenced objects to get cleaned up
    # so long as nothing looks like a pointer or reference to it
    ObjectSpace.garbage_collect

    # now get our object from ID
    # if it had no references it should have been GC'd and we should get an
    # rb_eRangeError "is not id value" (expected) or "is recycled object"
    # (possible)
    obj = ObjectSpace._id2ref( obj_id )

    return obj

  end

  def []=( key, object )

    # Fixnums have a constant ID for value, so can't be copied and can't be
    # garbage collected, so object.__id__ cannot be a reference to a child of
    # object and therefore cannot prevent garbage collection on the object
    super( key, object.__id__ )

  end

end

Asher
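
For comparison, a minimal sketch of the same idea built on the standard library's WeakRef instead of raw object IDs and _id2ref. This is a swapped-in technique, not the code under discussion. The class name WeakValueHash is hypothetical, and the sketch assumes values are ordinary heap objects rather than immediates such as small integers.

```ruby
require 'weakref'

# Hypothetical illustration: a hash whose values are held weakly.
class WeakValueHash
  def initialize
    @refs = {}
  end

  # Store a weak reference; this alone does not keep `object` alive.
  def []=(key, object)
    @refs[key] = WeakRef.new(object)
  end

  # Return the live object, or nil if the key is unknown or the
  # referent has already been collected.
  def [](key)
    ref = @refs[key]
    return nil unless ref
    ref.__getobj__
  rescue WeakRef::RefError
    @refs.delete(key)
    nil
  end
end

weak_hash = WeakValueHash.new
held = Object.new            # strong reference kept in a local
weak_hash[:key] = held

GC.start
puts weak_hash[:key].equal?(held)   # prints true: `held` is still referenced
```

Whether an entry whose object has gone out of scope returns nil right after a GC run is still up to the collector, so callers have to treat both outcomes as possible.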
Posted by unknown (Guest)
on 2010-08-27 20:35
(Received via mailing list)
On Fri, Aug 27, 2010 at 2:04 PM, Asher <asher@ridiculouspower.com> 
wrote:
> Trying your example 1000 times simply loops infinitely on my ruby. Not sure why.
>
> Trying it 10.times and then 2.times works - throws expected error on first attempt to retrieve.
>
> Trying it 1.times does not - returns uncollected object.
>
> Your example of 1000 times causes it to work on the 33rd try.
>
> What gives? or is your point that there is no way to predict this? If that's the case, why does it work the first time consistently with my attempts 2.times, 10.times, etc.?

Beats me :)

ruby 1.8.7 (2010-01-10 patchlevel 249) [i486-linux]
Linux 2.6.32 Ubuntu i686 GNU/Linux

I consistently see no RangeError until 34.times then none until 
38.times:

34.times RangeError when n = 0
38.times RangeError when n = 0
40.times RangeError when n = 38
43.times RangeError when n = 0
44.times RangeError when n = 42
46.times RangeError when n = 0
47.times RangeError when n = 42
48.times RangeError when n = 0
49.times RangeError when n = 38
50.times RangeError when n = 42
52.times RangeError when n = 38
53.times RangeError when n = 42
55.times RangeError when n = 38
56.times RangeError when n = 42
58.times RangeError when n = 38
59.times RangeError when n = 42
61.times RangeError when n = 0

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-linux]
Linux 2.6.18 i686 i686 i386 GNU/Linux

34.times RangeError when n = 0
35.times RangeError when n = 33
36.times RangeError when n = 33
38.times RangeError when n = 0
39.times RangeError when n = 0
40.times RangeError when n = 33
41.times RangeError when n = 38
43.times RangeError when n = 0
44.times RangeError when n = 0
45.times RangeError when n = 38
46.times RangeError when n = 0
47.times RangeError when n = 33
48.times RangeError when n = 0
49.times RangeError when n = 38
50.times RangeError when n = 47
Posted by Asher Haig (asher)
on 2010-08-30 05:46
(Received via mailing list)
On Aug 27, 2010, at 12:33 PM, Asher wrote:

> I want to know "where" on the C stack this "might" remain. It shouldn't be an obtuse question - Ruby is allocating each and every object, and I'm not using any C pointers for the particular example, so there is nothing else in my C stack (in this case, "I" don't have a C stack, only Ruby does). 


So my question comes down to:

def random_method
  # demo_var is internally mapped as a pointer to the newly created
  # Object, which is instantiated on the heap.
  demo_var = Object.new
  # demo_var is internally mapped to 4
  demo_var = nil
  # GC, in env_mark, walks (among others) space demarcated by
  # RubyVM::Env, which is defined by its length in objects (VALUE)
  ObjectSpace.garbage_collect
end

So the environment's memory space is evaluated as a series of long 
values (which were allocated during the compilation of the iseq), each 
of which is potentially a pointer pointing to the heap.

So as I understand, before the GC is called here we have 2 NODE_LASGN 
nodes. Is this correct?

So the first one allocates Object and assigns the reference to demo_var 
in the local var table on the stack.

The second one assigns demo_var in the local var table on the stack to 
4.

So where does the GC discover a reference to Object to test in order to 
mark? It is clear that if a reference to Object is left (invisibly) on 
the stack then it will be marked until the stack gets cleaned up. This 
would obviously not take place until the frame is taken off the stack. 
But I can't find anywhere that this would make sense. The only place 
that I see where a reference occurs that the GC is walking is in the 
locals table. But the instruction for NODE_LASGN (setlocal) changes the 
pointer value for the local variable reference. So there _shouldn't_, so 
far as I can tell, be a reference to Object; yet insofar as Object gets 
marked by gc_mark_locations (called by gc_env_mark), it has a reference 
still existing.

Can anyone help me find where this reference is occurring? My read of 
the code suggests that the GC should get "4" for the slot that would 
have been a pointer to Object, yet this isn't what happens.

Insight appreciated.

Asher
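
One way to check the "two local assignments" reading without stepping through the interpreter is to ask MRI for the compiled instruction listing. A minimal sketch, assuming a YARV-based MRI (1.9 or later) where RubyVM::InstructionSequence is available. Exact instruction names differ between versions, so the sketch only looks for names beginning with setlocal.

```ruby
source = <<RUBY
def random_method
  demo_var = Object.new
  demo_var = nil
  ObjectSpace.garbage_collect
end
RUBY

# Compile without running, then dump the instruction listing,
# which also covers the nested method body.
iseq = RubyVM::InstructionSequence.compile(source)
listing = iseq.disasm
puts listing

# Count the instructions that write to a local variable slot.
setlocal_lines = listing.lines.grep(/setlocal/)
puts "setlocal instructions found: #{setlocal_lines.size}"
```

The listing shows only what the VM is asked to do with the local variable slot. It does not show copies of the old pointer that may linger elsewhere, such as on the machine stack or in registers.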
Posted by Roger Pack (Guest)
on 2010-08-30 19:37
(Received via mailing list)
>> I want to know "where" on the C stack this "might" remain. It shouldn't be an obtuse question - Ruby is allocating each and every object, and I'm not using any C pointers for the particular example, so there is nothing else in my C stack (in this case, "I" don't have a C stack, only Ruby does).

>  ObjectSpace.garbage_collect
> So where does the GC discover a reference to Object to test in order to mark? It is clear that if a reference to Object is left (invisibly) on the stack then it will be marked until the stack gets cleaned up. This would obviously not take place until the frame is taken off the stack. But I can't find anywhere that this would make sense. The only place that I see where a reference occurs that the GC is walking is in the locals table. But the instruction for NODE_LASGN (setlocal) changes the pointer value for the local variable reference. So there _shouldn't_, so far as I can tell, be a reference to Object; yet insofar as Object gets marked by gc_mark_locations (called by gc_env_mark), it has a reference still existing.
>
> Can anyone help me find where this reference is occurring? My read of the code suggests that the GC should get "4" for the slot that would have been a pointer to Object, yet this isn't what happens.

http://timetobleed.com/what-is-a-ruby-object-introducing-memprof-dump/

might help.

Besides that, just stepping through using GDB might help you.
NB that the GC marks both references from the stack and "global
rooted" objects, like code segments which might be used later.

GL.
-r