Top block segfaults

Hey there again,

Usually when I get segfaults, it’s because my own C++ code does
something stupid. When I run it through the debugger, it always lets
me know where the problem is in the C++ code (that’s how I know it’s the
C++ code causing the problem). Imagine my surprise when I got something
to segfault in Python.

For some reason the top block code is causing a segfault for me, and I
don’t know why. Is there anything in top block that could cause this?
Or is it more likely a problem with my own Python code?

Here’s the debug output:

-> tb.run()
(Pdb) step
--Call--
> /usr/local/lib/python2.5/site-packages/gnuradio/gr/top_block.py(50)run()
-> def run(self):
(Pdb) step
> /usr/local/lib/python2.5/site-packages/gnuradio/gr/top_block.py(51)run()
-> top_block_run_unlocked(self._tb)
(Pdb) step
--Call--
> /usr/local/lib/python2.5/site-packages/gnuradio/gr/gnuradio_swig_py_runtime.py(1577)top_block_run_unlocked()
-> def top_block_run_unlocked(*args):
(Pdb) step
> /usr/local/lib/python2.5/site-packages/gnuradio/gr/gnuradio_swig_py_runtime.py(1579)top_block_run_unlocked()
-> return _gnuradio_swig_py_runtime.top_block_run_unlocked(*args)
(Pdb) step
Segmentation fault (core dumped)

Any help you could give me would be greatly appreciated.

-Ben



On Thu, Jan 17, 2008 at 12:36:38AM -0700, beezle bub wrote:

Ben,

it would be easier to figure out if you let us know the code
you’re trying to run, and what it looks like from the command line.
This could be the “failing to trap exception problem”, but from what
you’re showing us I can’t tell.

Also, GDB does a better job than pdb on our mixed C++/Python code.
Directions for running it are at
http://www.gnu.org/software/gnuradio/doc/howto-write-a-block.html#debugging

Eric

beezle bub wrote:

For some reason the top block code is causing a segfault for me, and
I don’t know why. Is there anything in top block that could cause
this? Or is it more likely a problem with my own Python
code?

We currently have an open bug in the trunk and stable branches:

http://gnuradio.org/trac/ticket/181

On the systems that exhibit it, this bug turns (some) exceptions thrown
in SWIG-wrapped C++ into aborts, even when the code is there to catch
the exception and handle it gracefully.

Since the new flow graph code in 3.1 is all done in C++ (in anticipation
of allowing pure C++ GNU Radio applications in 3.2), if there is a
misconfigured flowgraph in the user’s Python code, the C++ runtime will
throw an exception, and the handler is supposed to report it with a
helpful message. But, because of ticket:181, it simply becomes a segfault
instead.

That’s probably what you’re seeing here.

To track down the original problem, execute Python and your script
entirely from gdb. When you get the abort, do an ‘info threads’ and
‘thread apply all bt’.

On the faulting thread, a couple of stack frames from the top, you’ll
find the GNU Radio code that threw the exception. Switch to that frame
and list the code; you’ll see the conditional test and the proper error
message.
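
If you would rather not type the gdb commands by hand each time, the steps above can be sketched as a small Python helper that builds a batch-mode gdb command line. This is only a sketch: the helper name gdb_backtrace_command and the script name my_flowgraph.py are made up for illustration, and it assumes gdb is on your PATH and a Python recent enough to have shlex.join (3.8 or later). It relies only on gdb's standard -batch, -ex and --args options.

```python
import shlex
import sys


def gdb_backtrace_command(script, python=sys.executable):
    """Build a gdb command line that runs `script` under Python and,
    once the process stops (for example on the abort), lists the threads
    and prints a backtrace for each of them.

    Hypothetical helper, not part of GNU Radio.
    """
    return [
        "gdb", "-batch",               # run the -ex commands, then exit
        "-ex", "run",                  # start Python on the script
        "-ex", "info threads",         # list threads after the stop
        "-ex", "thread apply all bt",  # backtrace of every thread
        "--args", python, script,      # everything after --args is the program
    ]


if __name__ == "__main__":
    # "my_flowgraph.py" is a placeholder for your own script.
    print(shlex.join(gdb_backtrace_command("my_flowgraph.py")))
```

Paste the printed command into a shell. To switch frames and list the source you still need an interactive session, so drop -batch for that part.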

In your case, the function running was tb.run(), which invokes numerous
checks on the flow graph topology before handing it to the scheduler, so
one of these checks is probably failing.
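
To give a feel for what a topology check of this kind looks like, here is a rough Python sketch. It is not GNU Radio's actual code or API: the function check_topology, the data layout, and the specific rules (valid port numbers, no input connected twice, no input left unconnected) are assumptions chosen for illustration.

```python
def check_topology(blocks, connections):
    """Validate a toy flow graph before "running" it.

    blocks:      dict mapping block name -> (n_inputs, n_outputs)
    connections: list of (src, src_port, dst, dst_port) tuples

    Raises ValueError with a descriptive message on the first problem.
    Illustrative only; the real runtime checks live in C++.
    """
    fed = set()  # (block, input_port) pairs that already have a source
    for src, src_port, dst, dst_port in connections:
        if src_port >= blocks[src][1]:
            raise ValueError(f"{src} has no output port {src_port}")
        if dst_port >= blocks[dst][0]:
            raise ValueError(f"{dst} has no input port {dst_port}")
        if (dst, dst_port) in fed:
            raise ValueError(f"input {dst_port} of {dst} is connected twice")
        fed.add((dst, dst_port))

    # Every declared input must be fed by some connection.
    for name, (n_in, _) in blocks.items():
        for port in range(n_in):
            if (name, port) not in fed:
                raise ValueError(f"input {port} of {name} is not connected")


if __name__ == "__main__":
    blocks = {"src": (0, 1), "add": (2, 1), "sink": (1, 0)}
    try:
        # The second input of "add" is left dangling.
        check_topology(blocks, [("src", 0, "add", 0), ("add", 0, "sink", 0)])
    except ValueError as err:
        print("flow graph error:", err)
```

When exception handling works, a failed check surfaces as a readable message like the one printed here. On a system affected by ticket:181, the equivalent C++ exception ends in an abort instead.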

Ticket:181 is our highest priority bug at this point, and after many
hours of tracing through swig generated code and x86 assembly, we think
it is either a swig bug or (less likely) a gcc bug. The conditions (gcc
version, swig version, 64-bit vs. 32-bit, Python version, specific
exception) that trigger it are not consistent either, but when it does
happen, it happens reliably.

Eric and I have spent many hours trying to tackle this one; any help is
appreciated.


Johnathan C.
Corgan Enterprises LLC
http://corganenterprises.com