GC in C extension

Hi all,

I’m working on a database driver for the Ingres database and I’m nearly
done but I’ve got one outstanding issue. Initially I thought it was a
bug in rb_ary_new but now I’m wondering if it’s a problem with garbage
collection.

I have several arrays that I declare globally in my C code. Each holds
a collection of VALUE objects that are created locally within the
routines. The contents of the arrays (the VALUE objects) will
eventually corrupt on a long running program. (If it matters, the
largest of these arrays is an array of arrays holding a result set.)

I’ve traced the corruption to a rb_ary_new call. On the line before the
rb_ary_new, the data within the array is good. On the line after,
trying to print the data causes a segmentation fault.

So my question is this: will the GC collect a VALUE object that lives
within a global array? I thought storing the local variables within a
global variable would be a good enough reference to keep the data safe.

Any thoughts would be appreciated!

Jared
http://jaredrichardson.net

I think I’ve solved this… I assumed because the upper level arrays
were global that the contents would be safe. This was apparently not
the case. I added this line to protect the result set array

rb_global_variable(&resultset);

And it stopped corrupting. I think this call declares the variable to
the garbage collector so that it’s aware not to clean up the contents.

I had already moved the other variables (there were three smaller
arrays) into C arrays that I convert to Ruby arrays when the client
program calls for them. I did this rather than create the Ruby arrays
earlier in the program and let them sit, waiting to be retrieved by the
caller. It was during the “waiting” time that the corruption was
occurring.

Jared
http://jaredrichardson.net

Jared writes:

I think I’ve solved this… I assumed because the upper level arrays
were global that the contents would be safe. This was apparently not
the case. I added this line to protect the result set array

rb_global_variable(&resultset);

And it stopped corrupting. I think this call declares the variable to
the garbage collector so that it’s aware not to clean up the contents.

I was going to answer this, but it seems you’ve got there before I could
answer. Putting your VALUEs into C global arrays does not automatically
make them visible to the Ruby GC, you do need to take some action (as
you have done) to make the Ruby GC aware of them to avoid unwanted
collection before you need them collected.

Stephen

Stephen K. wrote:

I was going to answer this, but it seems you’ve got there before I could
answer. Putting your VALUEs into C global arrays does not automatically
make them visible to the Ruby GC, you do need to take some action (as
you have done) to make the Ruby GC aware of them to avoid unwanted
collection before you need them collected.

Here’s a follow-on then… my program is now running great but leaking
lots of memory. :) Because it’s a database driver, it doesn’t terminate
but rather runs for a very long time.

Declaring the variable as rb_global_variable prevents Ruby from garbage
collecting it. How do I release it back to Ruby again? Is there a
rb_unset_global_variable?

Thanks,

Jared
http://jaredrichardson.net

From: [email protected]
[mailto:[email protected]]
Sent: Monday, July 24, 2006 3:20 PM

Here’s a follow-on then… my program is now running great but leaking
lots of memory. :) Because it’s a database driver, it doesn’t terminate
but rather runs for a very long time.

Declaring the variable as rb_global_variable prevents Ruby from garbage
collecting it. How do I release it back to Ruby again? Is there a
rb_unset_global_variable?

Not that I know of, but use a global array or hash and put everything
you want to keep in it. Inserting and deleting from a hash should
be even faster than rb_global_variable.

cheers

Simon

Kroeger, Simon (ext) wrote:

Hi Simon,

In my C code I have a VALUE result_set that I set to rb_ary_new() on
each call to the database. I put new VALUE objects in this result set.
After a bit, however, the VALUE objects I’m putting into the result
set become invalid. It appears that the GC was reaping them.

Now that I’ve declared the result_set as an rb_global_variable, the bad
reaping has stopped… but now the memory is leaking. I assume it’s
from the orphaned variables from the last loop.

That make more sense?

From: [email protected]

from the orphaned variables from the last loop.

That make more sense?

That’s how I do it:
I declare a global Hash (just because it’s faster than an array) like

$keep_me = {}

(the easiest way to do that from C is to use eval)

In this hash I put everything my C code allocates and that I want to
keep for later (objects that are referenced from global containers
do not get GC’ed).

When I don’t need these objects anymore, I remove them from the hash (and
the GC will reclaim them at some point).

So put your result set in such a container when you create it and
remove it when you hand it over to your user.

cheers

Simon

Gotcha… thank you Simon. It seems to be working perfectly. :)

Here’s what I’m now doing on each pass through the code.

if (!keep_me) {
    keep_me = rb_ary_new();
    rb_global_variable(&keep_me);
}
// throw out the old data
rb_ary_clear(keep_me);

// store the new
rb_ary_push(keep_me, resultset);
rb_ary_push(keep_me, table_list);

etc…

Jared
http://jaredrichardson.net

Declaring the variable as rb_global_variable prevents Ruby from garbage
collecting it. How do I release it back to Ruby again? Is there a
rb_unset_global_variable?

You can use rb_gc_register_address() and rb_gc_unregister_address(). I’m
pretty sure that rb_gc_register_address() and rb_global_variable() are
actually the same function, but don’t quote me on that.

Jared writes:

In my C code I have a VALUE result_set that I set to rb_ary_new() on
each call to the database. I put new VALUE objects in this result set.
After a bit, however, the VALUE objects I’m putting into the result
set become invalid. It appears that the GC was reaping them.

How about “volatile VALUE result_set;”?

See: http://wiki.rubygarden.org/Ruby/page/show/GCAndExtensions