Segfaults and memory leaks

Hi,

I’m not really sure where to start with this, but I am seeing a lot of
segfaults with Ruby, in particular inside at_exit blocks.

Here’s a simplified version of a function which often segfaults:

  def self.stager_start deploy_mode, deploy_role, deploy_mock
    # do some work here, including creating a lock in the database
    at_exit do
      # clear the lock
      unless deploy_mock
        # the next line segfaults
        if deploy_mode != :unstaged
          # this line is not reached
        end
        # nor is this one
      end
    end
  end

It crashes fairly randomly and I don’t think I am going to be able to
produce a small test script to reproduce this error. I’m wondering what
my other options are.

I’m happy to do some debugging myself. I am a capable programmer but
haven’t done much work on Ruby before, so I am looking for somewhere to
start.

Here’s a crash dump:

http://pastie.org/4158632

And here’s some output from gdb:

http://pastie.org/4158654

I have tried running valgrind and get a /lot/ of errors. I have run it
several times now without seeing a segfault, but I am under the
impression that anything valgrind lists is likely to be a bug anyway,
and possibly related to the problems I’m having, so I will include some
of its output here:

http://pastie.org/4158647

Also some information about my environment here:

http://pastie.org/4158642

So… Can anyone help point me in the right direction with this?

Thanks,
James

On Jun 27, 2012, at 00:18, James Pharaoh wrote:

> Here’s a crash dump:
>
> http://pastie.org/4158632

Can you reproduce it without libxml-ruby and zmq? It’s possible that
one of these C extensions is corrupting your process and causing the
crash. If the crash still happens without these C extensions, you can
be much more confident that it is a bug in Ruby itself.
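
As a starting point for ruling extensions out, it can help to see which compiled extensions the process has actually loaded. The following is a minimal sketch, not an official diagnostic: `native_extensions` is a hypothetical helper name, and it assumes that a loaded feature is a C extension if its path ends in the platform's shared-library suffix from `RbConfig`. Extensions from the standard library will show up in the list too, not only third-party gems.

```ruby
require 'rbconfig'

# Return the entries of a feature list that look like compiled extensions,
# judged by the platform's shared-library suffix (for example "so" or "bundle").
# The default argument is the interpreter's list of loaded features.
def native_extensions(features = $LOADED_FEATURES)
  suffix = ".#{RbConfig::CONFIG['DLEXT']}"
  features.select { |path| path.end_with?(suffix) }
end

# Print whatever has been loaded so far. Call this after the application's
# own requires so the list is complete.
native_extensions.each { |path| puts path }
```

Each path that points into a third-party gem is a candidate to remove or replace while trying to reproduce the crash.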

libxml-ruby is often accused of leaking memory and causing crashes such
as these. Do the crashes disappear if you switch to nokogiri?

> And here’s some output from gdb:
>
> http://pastie.org/4158654

A crash while looking up a method is more likely to be due to memory
corruption by a C extension than a bug in ruby.
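
One way to make this kind of corruption show up closer to its cause is to force the garbage collector to run as often as possible while the suspect code is exercised. The sketch below assumes the problem is GC-related (for example, an extension that does not protect an object it still uses), which may not be the case here. `with_gc_stress` is a hypothetical helper, and the workload inside the block is only a stand-in.

```ruby
# Hypothetical helper: run a block with GC.stress enabled, then restore
# the previous setting even if the block raises.
def with_gc_stress
  previous = GC.stress
  GC.stress = true
  yield
ensure
  GC.stress = previous
end

result = with_gc_stress do
  # Stand-in workload. Replace this with the code path that uses the
  # suspect extension.
  Array.new(50) { |i| "item-#{i}" }.map(&:upcase).length
end

puts result
```

Code runs much more slowly with `GC.stress` enabled, so it is best wrapped around a small section rather than the whole program.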

> I have tried running valgrind and get a /lot/ of errors. I have run it
> several times now without seeing a segfault, but I am under the
> impression that anything valgrind lists is likely to be a bug anyway,
> and possibly related to the problems I’m having, so I will include some
> of its output here:
>
> http://pastie.org/4158647

This is not necessarily a valid assumption. One of the reports may be
valid, but a valgrind report contains a lot of noise unless valgrind is
run with the proper configuration for Ruby.
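
For illustration, here is a sketch of a less noisy invocation, written as a small Ruby helper that only builds the command line. The choice of flags is an assumption rather than an official Ruby recommendation: they are standard valgrind options that are often used to cut down reports caused by the interpreter's conservative garbage collector, which deliberately scans memory that may be uninitialised. `valgrind_command` and the script name `stager.rb` are hypothetical.

```ruby
# Build (but do not run) a valgrind command line for a Ruby script.
# The flag selection is an assumption, not a verified Ruby configuration.
def valgrind_command(script, ruby: "ruby")
  [
    "valgrind",
    "--num-callers=50",        # record deeper stack traces
    "--error-limit=no",        # keep reporting after many errors
    "--partial-loads-ok=yes",  # tolerate aligned loads that are only partly addressable
    "--undef-value-errors=no", # skip uninitialised-value reports
    ruby,
    script
  ]
end

cmd = valgrind_command("stager.rb")
puts cmd.join(" ")

# To actually run it: system(*cmd)
```

Turning off uninitialised-value reports also hides any genuine errors of that kind, so reports that remain (invalid reads and writes in particular) are the ones to look at first.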