Segmentation fault due to thread unsafety?

Detlef_R · July 4, 2015, 8:46am

Hi all,

I suspect that a threading problem is giving me a segmentation fault.
Hopefully someone who knows how threading works in GNU Radio can give
me a hint toward a solution.

What I did is as follows (C++ implementation code is quite long, so I
hope the following sketch of the situation is sufficient):

- I made a pure C++ test application, so not using GRC.
- I derived my own block from the sync block.
- In the work() function of my own block, variables are prepared to be
  plotted (simple xy-graphs). These variables are members of my block
  class.
- My own block also has a member function to plot the xy variables, say
  void plot_all(void).
- In main() there is a loop which waits for CTRL-C, and lets the GNU
  Radio thread sleep for 10 ms each iteration. In that loop, the
  plotting routine is called when more than 100 ms have elapsed since
  the last time: my_block->plot_all().
- If plot_all() only reads or only writes a member variable of my own
  block, all is fine. If it both reads and writes, I get a segmentation
  fault.
- I even get a segmentation fault if the plotting variables are made
  global, so outside my own block.

I suspect that a solution is that I drag all plotting into my own block,
and that also calling the plot_all() function must be done from within
the work() function (say every 100 ms).

a) Would that work? (Even if I try it and it works, I still would not
know whether the solution is foolproof, hence the question.)
b) Does that increase the chance of buffer overruns because of the CPU
time that plotting needs within work()?
c) Do I have to move the GLUT initialization into the constructor of my
own block as well, or can I leave it in main()?
d) Is there a better solution?

Sorry if I am asking a lot, but I’ve spent many hours on this problem
already and an expert may just have a simple solution for me.

Thanks and best regards,

Jeroen

unknown · July 4, 2015, 9:30am

Hi Jeroen,

> I suspect that a solution is that I drag all plotting into my own
> block, and that also calling the plot_all() function must be done from
> within the work() function (say every 100 ms).

I’d say quite the opposite is true!

So the point is that GUIs need to have their own loops to update the
display, so the GUI routines inherently run in a different thread than
your block. What you’re doing by plotting in your work() is that you
change the state (e.g. the buffer that is displayed on screen) that the
GUI thread works on.
Since changing that state is, like you’ve guessed, usually not
thread-safe, you get bad effects.

Typically, you let your GUI toolkit define callbacks or functions that a
programmer must overload to fill a specific framebuffer with image data,
when the toolkit calls it.

The trick is hence to not do anything graphical in your work(). Instead,
use some thread-safe mechanism (e.g. a queue with a mutex or, if you’re
using Qt, signals) to copy the data over to your GUI logic. In that
logic, take the data and plot it, but that has to be done from within
the GUI toolkit!
E.g. some GUI toolkits let you override a “paint” method to draw
something on the screen. Do that, based on the data you get from a
Queue.
That paint method will automatically be called by the GUI when e.g. the
window size changed. But you want to update every N samples, so after
passing the data to the Queue, the work() calls an “invalidate” function
from the toolkit – these are typically thread-safe, and trigger
updating of the screen, and in turn call the “paint” method.
I’d point you to gr-qtgui, but I think it’s fair to mention that you
should first read up a bit on Qt signals and slots – its source code
can be hard to understand, otherwise.
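
As a rough illustration of the "queue with a mutex" idea, here is a minimal sketch in plain C++11. It is not taken from GNU Radio, GLUT, or gr-qtgui: `PlotFrame`, `PlotQueue`, `push`, and `pop_latest` are hypothetical names made up for this example, and the cap of four frames is an arbitrary assumption.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical container for one xy-plot; not a GNU Radio or GLUT type.
struct PlotFrame {
    std::vector<float> x;
    std::vector<float> y;
};

// Hypothetical hand-off queue: the block's work() pushes frames,
// the GUI's paint/display callback pops them.
class PlotQueue {
public:
    // Called from the block's thread. The frame is moved in under the lock.
    void push(PlotFrame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        frames_.push_back(std::move(frame));
        // Drop the oldest frames so a slow GUI cannot grow the queue.
        while (frames_.size() > max_frames_) {
            frames_.pop_front();
        }
    }

    // Called from the GUI thread. Hands out the newest frame and discards
    // older ones. Returns false if nothing arrived since the last call.
    bool pop_latest(PlotFrame& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.empty()) {
            return false;
        }
        out = std::move(frames_.back());
        frames_.clear();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<PlotFrame> frames_;
    std::size_t max_frames_ = 4;  // arbitrary cap, an assumption
};
```

With something like this, work() would only ever call `push()`, and the toolkit's paint method would call `pop_latest()` and draw from its own private copy, so the two threads never touch the same vectors at the same time.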

Best regards,
Marcus

unknown · July 5, 2015, 11:26am

Hi Marcus,

Thanks for your extensive reply. From your reply I am not sure whether
you understood that I am not doing anything graphical in work(). In very
rough pseudocode this is what I have:

class my_block_class : sync_block {
public:
    vector plot_x_data, plot_y_data;

    work()
    {
        // Samples processing.

        // Sometimes prepare plot_x_data and plot_y_data and set a flag.
    }

    plot_data(*some_plot_class)
    {
        some_plot_class->plot(this->plot_x_data, this->plot_y_data);
    }
};

display_handler()
{
    my_block->plot_data(&plot_class)
}

main()
{
    // Many inits.
    // Start GNU Radio processing.

    while (true) {
        sleep(100m);
        handle_GLUT_events;   // This indirectly calls display_handler and
                              // thus the plotting routine of my block…
    }
}

Since my post and your reply I have tried several changes to the code.
What surprised me was the following:

- Sometimes I got an error about memory corruption or a double free.
  That looks like a more fundamental problem than simply access from
  different threads.
- I locked the few member functions that access the plotting data with
  boost::lock_guard to prevent multiple threads from accessing the data
  simultaneously, but the segmentation fault remains. Again an
  indication that it is not a thread problem but another memory issue.
- If access to the plotting data remains, but plotting itself is
  omitted, then the segmentation fault disappears!
- If I only read the plotting data during plotting and do no writing at
  all to this data, then there is no segmentation fault (and plotting
  works…).
- If I also write to the plotting data in my plot function (I ‘need’ to
  set a flag that the data was plotted, so that replotting is only done
  when there is new data, in order to save CPU time), then the
  segmentation fault comes back again.
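
For reference, this is roughly the locking pattern being aimed at, reduced to a minimal sketch. It uses `std::mutex` and `std::lock_guard` from C++11 rather than the Boost equivalents mentioned above, and `PlotState`, `publish`, and `take_snapshot` are hypothetical names for illustration only, not the real block. The key assumption is that every access, including the write in work() and the "was plotted" flag, goes through the same mutex, and that plotting happens on a copy taken under the lock.

```cpp
#include <mutex>
#include <vector>

// Hypothetical stand-in for the plotting members of the block.
class PlotState {
public:
    // Writer side, e.g. called from work() when new plot data is ready.
    void publish(const std::vector<float>& x, const std::vector<float>& y) {
        std::lock_guard<std::mutex> lock(mutex_);
        x_ = x;
        y_ = y;
        fresh_ = true;
    }

    // Reader side, e.g. called from the display handler.
    // Copies the data out and clears the flag in one critical section.
    // Returns false when there is nothing new to draw.
    bool take_snapshot(std::vector<float>& x, std::vector<float>& y) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fresh_) {
            return false;
        }
        x = x_;
        y = y_;
        fresh_ = false;
        return true;
    }

private:
    std::mutex mutex_;          // guards all three members below
    std::vector<float> x_, y_;
    bool fresh_ = false;        // true when unplotted data is waiting
};
```

The display handler would then plot only from the vectors filled by `take_snapshot()`, outside the lock. If a segmentation fault still shows up with a structure like this, that would point away from a data race on the plot data and toward a different memory bug.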

I’ll see if I can find out what the underlying problem is, then I’ll
post again.

Best regards and thanks for your support,

  Jeroen

unknown · July 6, 2015, 9:51am

Hi Jeroen,

oh, ok, then I misunderstood, sorry

Have you been able to track down the exact place where the segfault
happens? Maybe the GDB tutorial might not be the worst start:
https://gnuradio.org/redmine/projects/gnuradio/wiki/TutorialsGDB

You might want to set breakpoints at the position where you write data
to your buffer and where you use that buffer to plot. Use something like
“print variablename” to make sure the addresses and variables are really
what you expect them to be.

Best regards,
Marcus