Re: Crash when handling exception coming from a signal handler

Masao_M · May 16, 2006, 2:02pm

I’m re-sending this mail because my e-mail server is sometimes rejected by
sf.net.

Hi,

Hmm, it doesn’t occur on my system.

% ruby test.rb
test.rb:10: undefined method `bloep!’ for main:Object (NoMethodError)
from test.rb:20
% ruby -v
ruby 1.8.4 (2005-12-24) [i686-linux]

Ruby-GNOME2 is the latest CVS version,
with your rbgclosure patch.

Please tell me the details of the system on which this problem occurs.

If it’s on x86_64 systems only,
I can’t test it … it may be a problem with incorrect casting somewhere.

On Fri, 12 May 2006 12:41:18 +0200
[email protected] (Sjoerd S.) wrote:

==10757==    by 0x5761382: g_signal_emit (in
==10757==    by 0x4B5F8A1: ruby_run (in /usr/lib/libruby1.8.so.1.8.4)
“It’s today!” said Piglet.
“My favorite day,” said Pooh.

–

.:% Masao M.[email protected]

Masao_M · May 16, 2006, 7:51pm

On Mon, May 15, 2006 at 12:22:17AM +0900, Masao M. wrote:

% ruby -v
ruby 1.8.4 (2005-12-24) [i686-linux]

Ruby-GNOME2 is the latest CVS version,
with your rbgclosure patch.

Please tell me the details of the system on which this problem occurs.

If it’s on x86_64 systems only,
I can’t test it … it may be a problem with incorrect casting somewhere.

I can reproduce it on both amd64 and x86 (the one shown above was on
x86). I’ve got glib 2.10.2, which might also have something to do with it?

Sjoerd

The questions remain the same. The answers are eternally variable.

Masao_M · May 16, 2006, 10:29pm

Hi,

I can reproduce it on both amd64 and x86 (the one shown above was on
x86). I’ve got glib 2.10.2, which might also have something to do with it?

Sjoerd

I’ve run your test program and got the same error as you. Then I
dist-upgraded my system (I’m running unstable) and now the exception is
caught as expected (same message as Masao).

Since glib has not been updated in the meantime, I think the problem comes
from gtk: the new version on my system is now 2.8.17-1.

HTH

–
Mathieu B.
http://www.mblondel.org

Masao_M · May 17, 2006, 11:49am

On Mon, May 15, 2006 at 12:22:17AM +0900, Masao M. wrote:

% ruby -v
ruby 1.8.4 (2005-12-24) [i686-linux]

Ruby-GNOME2 is the latest CVS version,
with your rbgclosure patch.

Please tell me the details of the system on which this problem occurs.

If it’s on x86_64 systems only,
I can’t test it … it may be a problem with incorrect casting somewhere.

Ok, I’ve investigated this further. Glib signals are not written to cope with
exception handling via setjmp/longjmp or getcontext/setcontext jumping over the
signal code itself. So the current code can lead to interesting errors (like the
one I encountered).

More specifically: when glib emits a signal it saves a pointer to values on
the stack, then the handler is run. Now if we get an exception in the handler,
it will do JUMP_TAG, which jumps out of the signal code and eventually
invalidates part of the stack. But glib’s signal code still has a pointer to
something on that part of the stack! So when the next glib signal is handled, it
will dereference something that’s no longer there… That is what causes the
crash I’m seeing on some machines, but if you’re lucky you don’t even notice it
except when running valgrind (which is what I see on some other machines).

The only solution to this, afaik, is to not propagate the exception over the
signal handling boundary, which is exactly what, for example, the Python gtk
bindings do.

The attached patch does that. It just prints the relevant part of the exception
instead of propagating it.

Sjoerd
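
For illustration only (the attached patch itself is not reproduced in this thread): the idea of stopping the exception at the handler boundary can be sketched in plain Ruby. The name `call_handler_safely` and its `err:` parameter are made up for this example and are not part of Ruby-GNOME2; the real fix would presumably live in the C closure code, so treat this as a rough model of the behaviour, not the actual implementation.

```ruby
# Hypothetical stand-in for the binding layer that runs a Ruby block on
# behalf of a signal emission. Nothing here is Ruby-GNOME2 API.
def call_handler_safely(handler, *args, err: $stderr)
  handler.call(*args)
rescue StandardError => e
  # Report the exception here instead of letting it unwind through the
  # emitter's frames. (Assumption: only StandardError is handled in this
  # sketch; a real binding would have to consider other non-local exits.)
  err.puts "#{Array(e.backtrace).first}: #{e.message} (#{e.class})"
  nil
end

# Stand-in for two consecutive signal emissions: the first handler fails,
# the second one still runs because nothing propagated past the boundary.
handlers = [
  proc { bloep! },                          # undefined method, as in test.rb
  proc { puts "second handler still runs" }
]
handlers.each { |h| call_handler_safely(h) }
```

The trade-off is the one described above: the caller of the emission never sees the exception, it is only printed.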

Masao_M · May 17, 2006, 1:30am

On Tue, May 16, 2006 at 10:23:10PM +0200, Mathieu B. wrote:

Since glib has not been updated meanwhile, I think the problem comes
from gtk : the new version on my system is now 2.8.17-1.

I’m also running the latest Debian unstable on my machines. Could you do me a
favour? Please install libglib2.0-0-dbg and run my test program under valgrind.

Just skip all the initial output… Wait till it’s fully rendered. If you can then
move the slider without any valgrind output, the problem does indeed not occur at
all on your machine. Otherwise it happens, but just doesn’t cause a crash.

I’ve attached the relevant part of my valgrind log, so you can check for
differences if you get any output…

Sjoerd

Masao_M · May 17, 2006, 12:02pm

Hi,

I’m also running the latest Debian unstable on my machines. Could you do me a
favour? Please install libglib2.0-0-dbg and run my test program under valgrind.

I don’t have time right now to compare my log with yours.

HTH,

Masao_M · May 17, 2006, 2:45pm

Hi Sjoerd,

On Wed, 17 May 2006 11:47:06 +0200
[email protected] (Sjoerd S.) wrote:

Ok, I’ve investigated this further. Glib signals are not written to cope with
exception handling via setjmp/longjmp or getcontext/setcontext jumping over the
signal code itself. So the current code can lead to interesting errors (like the
one I encountered).

The only solution to this, afaik, is to not propagate the exception over the
signal handling boundary, which is exactly what, for example, the Python gtk
bindings do.

The attached patch does that. It just prints the relevant part of the exception
instead of propagating it.

OK, applied. Thanks for your investigation and
detailed explanation. I understood it well ;).

BTW, I’m waiting for 2 replies from you.

Could you reply to them?
I haven’t applied these patches yet.

http://www.ruby-forum.com/topic/65877#new
http://www.ruby-forum.com/topic/61419#new

–
.:% Masao M.[email protected]

Masao_M · May 17, 2006, 3:46pm

On Wed, May 17, 2006 at 09:41:29PM +0900, Masao M. wrote:

The only solution to this, afaik, is to not propagate the exception over the
signal handling boundary, which is exactly what, for example, the Python gtk
bindings do.

The attached patch does that. It just prints the relevant part of the exception
instead of propagating it.

OK, applied. Thanks for your investigation and
detailed explanation. I understood it well ;).

Thanks !

BTW, I’m waiting for 2 replies from you.

Could you reply to them?
I haven’t applied these patches yet.

Re: [Patch] Support for G_TYPE_VALUE_ARRAY - Ruby-Gnome 2 - Ruby-Forum
http://www.ruby-forum.com/topic/61419#new

I’ve replied to one of them (about value array); I didn’t know you were waiting
for an answer.

For the glib thread signal problem: now that CVS works again, I need some time to
resync my stuff, and I’ll send you a new patch with some debug info rsn.

Sjoerd

UFOs are for real: the Air Force doesn’t exist.
