Bus errors & segfaults

I’ve been getting daily Bus Errors in various Ruby classes, and they
crash our Mongrel server pretty much every day.

Here is an example error:

ruby(15833,0xa000ed88) malloc: *** error for object 0x41b20d0: pointer
being reallocated was not allocated
ruby(15833,0xa000ed88) malloc: *** set a breakpoint in szone_error to
debug
ruby(15833,0xa000ed88) malloc: *** error for object 0x41b20d0: pointer
being reallocated was not allocated
ruby(15833,0xa000ed88) malloc: *** set a breakpoint in szone_error to
debug
/usr/local/lib/ruby/1.8/pathname.rb:266: [BUG] Bus Error

Here is another one I get:

ruby(805,0xa000ed88) malloc: *** error for object 0x308ff20: pointer
being reallocated was not allocated
ruby(805,0xa000ed88) malloc: *** set a breakpoint in szone_error to
debug
ruby(805,0xa000ed88) malloc: *** error for object 0x308ff20: incorrect
checksum for freed object - object was probably modified after being
freed, break at szone_error to debug
ruby(805,0xa000ed88) malloc: *** set a breakpoint in szone_error to
debug
ruby(805,0xa000ed88) malloc: *** error for object 0x308ff20: pointer
being reallocated was not allocated
ruby(805,0xa000ed88) malloc: *** set a breakpoint in szone_error to
debug
/usr/local/lib/ruby/gems/1.8/gems/actionpack-1.12.5/lib/action_controller/rescue.rb:136:
[BUG] Segmentation fault

It’s running on OS X Server 10.4.8 with 2 GB of RAM.

Any ideas?

Blake M. wrote:

I’ve been getting daily Bus Errors in various Ruby classes, and they
crash our Mongrel server pretty much every day.

Here is an example error:

ruby(15833,0xa000ed88) malloc: *** error for object 0x41b20d0: pointer
being reallocated was not allocated

A bus error can occur if there is misaligned data on certain processor
architectures, or if a physical address being accessed does not exist.
If the virtual address does not exist, a segmentation fault occurs. You
seem to be getting both.

In this case, I believe that these errors are indicative of a serious
software incompatibility issue, possibly between Ruby and its shared
libraries. For example, if Ruby calls into an extension shared library,
and the library functions are not at the expected addresses, you may get
this kind of problem. The .so file might be corrupted. The .so file
might have been compiled with different flags than Ruby was compiled
with. All these could cause issues.

  • Has this been happening right from the start?
  • When did you first notice this behavior?
  • Do you have any Ruby extensions that might have been compiled with
    different compiler flags to the ones used to compile Ruby?

Without more information, I would recommend recompiling Ruby and all
third-party C extensions, ensuring that they are all compiled with the
same compiler options. If you still get these issues, then more
information would be needed, such as the processor in your server
(PowerPC or Intel?), which compiler version you are using, which version
of Ruby, etc.
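
To see which options your Ruby was built with, something like the following may help. It is only a sketch: it assumes the `rbconfig` library that ships with Ruby exposes `RbConfig::CONFIG` (very old 1.8 releases may only offer it as `Config::CONFIG`), and the constant `BUILD_KEYS` and the method `ruby_build_settings` are names I made up for illustration.

```ruby
require 'rbconfig'

# Settings worth comparing against each extension's build output.
# (BUILD_KEYS is an illustrative name, not part of any library.)
BUILD_KEYS = %w[CC CFLAGS LDFLAGS arch ruby_version].freeze

# Returns a hash of the build settings recorded when this Ruby was compiled.
# (ruby_build_settings is a hypothetical helper for this example.)
def ruby_build_settings(keys = BUILD_KEYS)
  keys.each_with_object({}) do |key, settings|
    settings[key] = RbConfig::CONFIG[key]
  end
end

# Print one setting per line so it can be pasted into a reply.
ruby_build_settings.each do |key, value|
  puts "#{key.ljust(14)} #{value}"
end
```

Comparing that output with the flags shown when an extension is compiled should reveal any obvious mismatch, and the `arch` line also answers the PowerPC or Intel question.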

Good luck!

Blake M. wrote:

So, I apologize for posting to the Ruby list; it just seemed general
enough to be a Ruby error (especially since one of the errors did not
seem Rails-related).

The developer is working hard on the next version of the OpenBase
bindings, so we should have a solution in a couple days.

Thanks for the lengthy input.

I believe no apology is necessary - it was an interesting question and
might have been somehow Ruby related, you never know. Thanks for letting
us all know what the likely cause is. I did not think of (but should
have) a buggy Ruby extension being the problem - I guess I am so used to
using ones that are very solid! I hope your application is up and
running reliably soon.

You don’t specify which hardware platform you’re running, but it does
seem as though you’re using a locally-compiled version of Ruby. Did
you build the OpenBase bindings yourself, or get a binary package from
your 3rd-party source? If it’s the latter, I’d make sure that they
built their bindings against precisely the same version of the Ruby
dev headers and library as yours.
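
If it helps to know which compiled extensions a process has actually loaded, here is a rough sketch. It assumes `RbConfig::CONFIG['DLEXT']` holds the extension suffix for your platform (typically `bundle` on Mac OS X and `so` on Linux); `DLEXT` and `native_extensions` are names invented for this example. On older Rubies the entries in `$LOADED_FEATURES` may be bare file names rather than full paths.

```ruby
require 'rbconfig'

# File suffix used for compiled extensions on this platform.
# (DLEXT is an illustrative constant name.)
DLEXT = RbConfig::CONFIG['DLEXT']

# Picks out the entries that look like compiled extensions from a list of
# loaded feature names. (native_extensions is a hypothetical helper.)
def native_extensions(features = $LOADED_FEATURES, dlext = DLEXT)
  features.select { |path| File.extname(path) == ".#{dlext}" }
end

# Run this inside the application after its libraries have been required.
puts native_extensions
```

Each file it lists is a candidate to rebuild against the same Ruby headers and library as the interpreter that loads it.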

Aside from that, I’d recommend seriously thinking about moving off OS
X as a production platform for Rails apps – I still have regular
problems with Bus Errors crashing my ruby processes on Mac OS, even
after building my own Ruby (and all my native extensions) from source.

Edwin F. wrote:

  • Has this been happening right from the start?
  • When did you first notice this behavior?
  • Do you have any Ruby extensions that might have been compiled with
    different compiler flags to the ones used to compile Ruby?

I believe the problem is actually due to the OpenBase bindings that we
are using for our Rails application. I’ve talked to the developer, and,
while the errors do not all occur in the OpenBase bindings/adapter, he
feels that version 0.7.3 of the bindings is actually corrupting Ruby’s
memory and causing errors elsewhere.

These errors have been around since the beginning of development of our
Rails application, but seemed to subside when we put the app into
production. New to our setup was Mongrel as our application server,
which crashes whenever malloc reports an error like those above. WEBrick
seems to handle them without crashing (for the past couple of hours
anyway). So, for now, we’re running WEBrick instead of Mongrel.

So, I apologize for posting to the Ruby list; it just seemed general
enough to be a Ruby error (especially since one of the errors did not
seem Rails-related).

The developer is working hard on the next version of the OpenBase
bindings, so we should have a solution in a couple days.

Thanks for the lengthy input.