Ruby C extensions, callbacks, garbage collection, stack scanning

I found a strange, almost unbelievable bug in the way that the Ruby
garbage collector scans the stack:

(1) Ruby scans the stack using an imprecise pointer detector
(is_pointer_to_heap in gc.c)

(2) At the point where the crash occurs, it is scanning an
8192-byte buffer that happens to be on the stack somewhere
in the middle of libguestfs, and happens to contain random
data related to the upload operation. Ruby really should not
be scanning this buffer.

(3) The word stored at the part of this buffer that is being
scanned (0xbfa71ce4) looks like a valid pointer to a Ruby object.
It isn’t; it’s just some random data that happens to look like one.

(4) If we interpret this random data as a Ruby VALUE, it becomes
clearer what’s going on:

(gdb) print/x *0xbfa71ce4
$15 = 0xb6dcec98
(gdb) print *(RVALUE *)0xb6dcec98
$17 = {
as = {
free = {
flags = 98, # to Ruby this looks like T_DATA
data = {
basic = {
flags = 98,
klass = 3077482900
dmark = 0x49b0de10 <mark_load_arg>,
dfree = 0,
data = 0xbfa71ccc

(5) As you can see from the stack trace, Ruby follows the
completely bogus “data” pointer 0xbfa71ccc:

#9 0x49b0de33 in mark_load_arg (ptr=0xbfa71ccc) at marshal.c:841

and eventually this causes a crash.
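
To make points (1) to (5) concrete, here is a minimal sketch of how an imprecise (conservative) scanner can mistake buffer contents for an object reference. This is not Ruby's code: the toy heap, slot_t, looks_like_heap_pointer, scan_words and false_positive_demo are all hypothetical names invented for illustration, and the real is_pointer_to_heap in gc.c applies its own checks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy object slot, standing in for an RVALUE (hypothetical layout). */
typedef struct slot {
    unsigned long flags;
    void *data;
} slot_t;

/* Toy heap: a fixed array of slots, standing in for the Ruby object heap. */
#define HEAP_SLOTS 16
static slot_t heap[HEAP_SLOTS];

/* Hypothetical analogue of an imprecise pointer detector: it accepts any
   word that lands on a slot boundary inside the heap. It cannot know
   whether the word was ever meant to be a pointer. */
static int looks_like_heap_pointer(uintptr_t word)
{
    uintptr_t lo = (uintptr_t)&heap[0];
    uintptr_t hi = (uintptr_t)&heap[HEAP_SLOTS];

    if (word < lo || word >= hi)
        return 0;
    return (word - lo) % sizeof(slot_t) == 0;
}

/* Scan a memory range one word at a time and count the words that the
   detector would treat as references. A real collector would go on to
   mark each such object, for example by calling its dmark function. */
static int scan_words(const void *start, size_t nbytes)
{
    int hits = 0;

    assert(start != NULL);
    for (size_t off = 0; off + sizeof(uintptr_t) <= nbytes;
         off += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, (const char *)start + off, sizeof word);
        if (looks_like_heap_pointer(word))
            hits++;
    }
    return hits;
}

/* An I/O-style buffer on the stack that happens to contain one word
   with the same bit pattern as a heap slot address. */
int false_positive_demo(void)
{
    unsigned char buffer[8192];
    uintptr_t coincidence = (uintptr_t)&heap[3];

    memset(buffer, 0, sizeof buffer);
    memcpy(buffer + 128, &coincidence, sizeof coincidence);
    return scan_words(buffer, sizeof buffer);
}
```

Calling false_positive_demo() reports one candidate reference, even though the buffer holds nothing but data. In this sketch the scanner has no means of telling the two cases apart, which is the situation described above.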

I’m assuming this behaviour can’t possibly be intentional. In other
languages that I’ve used which have mark-sweep garbage collectors,
it’s normal that you have to mark parts of the C stack which need
scanning, instead of risking the above happening.
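
For comparison, the explicit approach can be sketched as below. This is a toy illustration, not any particular runtime's API: object_t, push_root, pop_roots, mark_roots and explicit_roots_demo are hypothetical names. It is similar in spirit to root-registration macros such as OCaml's CAMLparam and CAMLlocal, where the C code tells the collector which local variables hold references.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object with just a mark bit (hypothetical). */
typedef struct object {
    int marked;
} object_t;

/* Shadow stack of registered roots: addresses of C locals that hold
   object references. Only these locations are ever examined. */
#define MAX_ROOTS 64
static object_t **roots[MAX_ROOTS];
static size_t nroots = 0;

static void push_root(object_t **slot)
{
    assert(nroots < MAX_ROOTS);
    roots[nroots++] = slot;
}

static void pop_roots(size_t n)
{
    assert(n <= nroots);
    nroots -= n;
}

/* Mark phase: follow registered roots only, never the raw C stack.
   Returns the number of objects newly marked. */
static int mark_roots(void)
{
    int newly_marked = 0;

    for (size_t i = 0; i < nroots; i++) {
        object_t *obj = *roots[i];
        if (obj != NULL && !obj->marked) {
            obj->marked = 1;
            newly_marked++;
        }
    }
    return newly_marked;
}

/* Returns 1 if only the registered local was marked. */
int explicit_roots_demo(void)
{
    static object_t live, unregistered;
    object_t *local = &live;
    /* A stack buffer whose contents look like a reference. It is never
       registered, so the collector never looks at it. */
    object_t *buffer[4] = { &unregistered, NULL, NULL, NULL };
    int n;

    (void)buffer;
    live.marked = 0;
    unregistered.marked = 0;

    push_root(&local);
    n = mark_roots();
    pop_roots(1);

    return n == 1 && live.marked && !unregistered.marked;
}
```

The trade-off in this design is that the extension author has to register every local that holds a reference and forgetting one leads to a premature free, but buffers and other coincidental values on the stack are never misread as objects.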

But I can’t find any documentation in Ruby about how one would mark
the stack. So how do I avoid Ruby scanning bits of the stack that are
buffers or contain other coincidental values?