Source block question: internal sleep loop or let GR scheduler handle empty read?

Monahan-MitchellS_T · May 7, 2013, 12:12am

Looking at some source block examples (audio jack, comedi, file source,
etc.), sometimes I see a sleep() called inside the work function when
there is no data to read, sometimes not (noutputitems = 0).

For a real file, not sleeping and returning noutputitems = 0 makes
perfect sense.

For a stream device, using fread(), what are the pros/cons of the work
loop sleeping if fread is empty, or letting the scheduler decide?

If I can influence the implementation of the device driver, a similar
question: what are the pros/cons of the driver sleeping (or scheduling)
when there is no data yet, versus just returning an empty read?

(ignore endless looping for these questions, that could be detected)

I will be using GR v3.7 when available.

Thanks in advance,
Tim

Monahan-MitchellS_T · May 7, 2013, 3:48am

Hi Tim - My vote is to let the scheduler decide.

In GNU Radio’s “thread per block” scheduler, I think either will work
correctly, but I tend to believe returning 0 is preferred so that the
scheduler can determine what to do next.

In the older GNU Radio “single threaded scheduler” you’d want to have
“work” return 0 and let the scheduler figure out how to handle it, so
as to allow other blocks’ “work” to make progress during the waiting; if
any “work” blocks in sleep or IO, all processing stops until the block
is lifted.

Some alternative scheduler, maybe Josh’s GRAS, could separate threads
and blocks by using a cluster of threads and a smart queue to handle
block execution in a combined FIFO / priority manner. In this case, the
block should return 0 and let the scheduler decide, so that the “work
processing” thread can be used for some other block’s “work”.

When I look at these 3 primary ways of handling the “work”, 2 of 3
should return 0 and not sleep or otherwise block, and the 3rd one can go
either way. The basic idea is to keep “work” happening somewhere in the
waveform/graph, by disallowing sleep or blocking IO.
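
As a rough illustration of "return 0 and let the scheduler decide", here is a minimal sketch of a work-style function that never sleeps. It assumes a POSIX system and a descriptor opened in non-blocking mode; `work_nonblocking` and `set_nonblocking` are hypothetical names, and the signature is deliberately not GNU Radio's `work()`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Put an already-open descriptor into non-blocking mode.
bool set_nonblocking(int fd)
{
    int flags = ::fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return false;
    return ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Hypothetical stand-in for a source block's work routine (assumption:
// not the real GNU Radio signature).
// Returns the number of bytes produced, 0 if the device has nothing yet,
// or -1 on end of stream or a real error.
int work_nonblocking(int fd, char* out, std::size_t max_bytes)
{
    assert(out != nullptr);
    if (max_bytes == 0)
        return 0; // nothing requested, nothing produced

    ssize_t n = ::read(fd, out, max_bytes);
    if (n > 0)
        return static_cast<int>(n); // data was available
    if (n == 0)
        return -1; // writer closed: end of stream
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return 0; // no data yet: hand control back, do not sleep
    return -1; // genuine I/O error
}
```

The caller (here, the scheduler) sees 0 and decides for itself when to call again.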

Hence, my vote to let the scheduler decide. I hope this helps; someone
please correct me if I’m wrong somewhere or if there are other
interesting ways to look at this issue. - MLD

Monahan-MitchellS_T · May 7, 2013, 3:30pm

On Mon, May 6, 2013 at 9:47 PM, Michael D. wrote:

Hence, my vote to let the scheduler decide. I hope this helps; someone please
correct me if I’m wrong somewhere or if there are other interesting ways to look
at this issue. - MLD

Yes, leave it to the scheduler. You don’t want to be sleeping in a
block, unless that’s part of its normal function.

You also don’t want to have a blocking call inside of a work function,
unless it can be interrupted.

Tom

Monahan-MitchellS_T · May 7, 2013, 4:58am

In GNU Radio’s “thread per block” scheduler, I think either will
work correctly but I tend to believe returning 0 is preferred such
that the scheduler can determine what to do next.

You can technically block forever as long as you are either in a boost
interruptible call or periodically checking for boost thread interrupt
stuff. This is because flow graph stopping performs boost thread
interrupts.

I don’t recommend this. I think it’s bad practice, it ties stuff to boost
threads, and half the dang versions of boost have somehow managed to be
released with broken interrupt functionality. And no OS ever patches
these, it’s always an update for a future OS… just gripes

Some alternative scheduler, maybe Josh’s GRAS, could separate
threads and blocks by using a cluster of threads and a smart queue to
handle block execution in a combined FIFO / priority manner. In this
case, the block should return 0 and let the scheduler decide, such
that the “work processing” thread can be used for some other block’s
“work”.

Basically you can have an arbitrary number of thread pools with an
arbitrary number of threads in a pool handling an arbitrary number of
blocks, aka actors; GRAS can potentially have part of the flow graph in
TPB mode and another in STS mode. Fun stuff!

So if some of your blocks steal all the threads away in a particular
pool, you can end up with starvation for the other blocks in that pool.
Of course, maybe for some topologies it’s OK if a provider steals
threads from a consumer; after all, what’s the consumer going to do with
a thread if the provider has nothing to produce? But if you start stealing
threads from other providers that are on equal footing in the topology,
you will run into trouble.

But the major downside to stealing the thread context for too long is
that the block has other things besides work-related events to handle.
Consider setting properties, reading properties, reacting to topology
changes, responding to a status query… It’s the equivalent of locking a
mutex and blocking in work while some external entity tries to lock the
same mutex to set a property, like FIR taps.
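
The mutex analogy can be made concrete with a toy example. This is only a sketch under stated assumptions: `toy_block`, `set_taps` and the `wait_for_input` callback are hypothetical and are not GNU Radio or GRAS classes. The point it shows is that the slow wait happens with the mutex released, so a property setter is never stuck behind it.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical toy block (assumption: not a GNU Radio or GRAS class).
class toy_block
{
public:
    // Called from "outside", e.g. a control thread changing a property.
    void set_taps(const std::vector<float>& taps)
    {
        std::lock_guard<std::mutex> lock(d_mutex);
        d_taps = taps;
    }

    // wait_for_input stands in for the potentially slow device read.
    template <class WaitFn>
    float work(WaitFn wait_for_input)
    {
        // Slow part first, with the mutex NOT held, so set_taps() can get in.
        const float x = wait_for_input();

        // Hold the lock only long enough to snapshot the taps.
        std::vector<float> taps;
        {
            std::lock_guard<std::mutex> lock(d_mutex);
            taps = d_taps;
        }

        // Use the snapshot without holding the lock.
        float acc = 0.0f;
        for (float t : taps)
            acc += t * x;
        return acc;
    }

private:
    std::mutex d_mutex;
    std::vector<float> d_taps;
};
```

Had `work` taken the lock before calling `wait_for_input`, any `set_taps` call arriving during the wait would stall for as long as the device stayed quiet.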

When I look at these 3 primary ways of handling the “work”, 2 of 3
should return 0 and not sleep or otherwise block, and the 3rd one can
go either way. The basic idea is to keep “work” happening somewhere
in the waveform/graph, by disallowing sleep or blocking IO.

Hence, my vote to let the scheduler decide. I hope this helps;
someone please correct me if I’m wrong somewhere or if there are
other interesting ways to look at this issue. - MLD

The real challenge is that the scheduler doesn’t know how to do the
polling on many arbitrary “things” in a generic way. So the block does
it, which is a little suboptimal. This nicely keeps the abstraction of
the blocking inside the user’s work function, but always checking a
condition and returning zero means that at least one of the threads is
spinning with little useful work. You can mitigate this with some
sleeping or a timeout, which may be undesirable with STS.

Here are my thoughts: No work routine should ever block the work function.
This makes most if not all models of the scheduling universe happy. I
prefer that in all cases source work routines use some sort of poll with
a timeout of 10ms. Source blocks should not be put into thread pools in
which they can cause work starvation of other blocks:
threads_in_pool >= num_source_blocks + (num_other_blocks ? 1 : 0)
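
A minimal sketch of the "poll with a 10ms timeout" idea, assuming a POSIX descriptor and using `poll()`. The names `work_poll` and `min_threads_in_pool` are hypothetical, and the signature is not GNU Radio's `work()`. The second function simply restates the pool-size rule above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <unistd.h>

// Hypothetical source work routine: wait at most timeout_ms for data.
// Returns bytes produced, 0 on timeout, or -1 on end of stream or error.
int work_poll(int fd, char* out, std::size_t max_bytes, int timeout_ms = 10)
{
    assert(out != nullptr && timeout_ms >= 0);
    if (max_bytes == 0)
        return 0;

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int ready = ::poll(&pfd, 1, timeout_ms);
    if (ready == 0)
        return 0; // timed out: give the thread back
    if (ready < 0)
        return (errno == EINTR) ? 0 : -1;

    if (pfd.revents & POLLIN) {
        ssize_t n = ::read(fd, out, max_bytes);
        if (n > 0)
            return static_cast<int>(n);
        if (n < 0 && (errno == EAGAIN || errno == EINTR))
            return 0;
        return -1; // n == 0 (end of stream) or a real error
    }
    return -1; // hang-up or error reported with no readable data
}

// Pool-size rule from the text: one thread per source block, plus one
// shared thread if the pool holds any other blocks.
std::size_t min_threads_in_pool(std::size_t num_source_blocks,
                                std::size_t num_other_blocks)
{
    return num_source_blocks + (num_other_blocks ? 1 : 0);
}
```

With this shape the thread is held for at most the timeout per call, so the routine neither spins nor blocks indefinitely.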

Don’t like 10ms? Most people do, but that’s OK. What if the scheduler
provided an API to let the block know how long it’s allowed to steal the
thread for? Then you can be smart and adjust the number based on a
down-time vs. activity heuristic. Probably something similar to the
algorithm for reactive spin locks:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.63.3516

Although, you will probably just hard-code the API call to 10ms, because
that’s the best number to use.
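
For what a "down-time vs. activity" adjustment might look like, here is one possible sketch. It is an assumption throughout: no such scheduler API exists in the thread, `adaptive_timeout` is a hypothetical helper, and the doubling back-off is a simple stand-in, not the reactive spin-lock algorithm from the linked paper.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: picks the next poll timeout from recent activity.
class adaptive_timeout
{
public:
    adaptive_timeout(int min_ms, int max_ms)
        : d_min(min_ms), d_max(max_ms), d_cur(min_ms)
    {
        assert(min_ms > 0 && min_ms <= max_ms);
    }

    // Timeout to use for the next poll, in milliseconds.
    int current_ms() const { return d_cur; }

    // Call after each work invocation with the number of items produced.
    void update(int produced)
    {
        if (produced > 0) {
            d_cur = d_min; // active: poll again soon
        } else if (d_cur > d_max / 2) {
            d_cur = d_max; // idle: cap at the maximum (also avoids overflow)
        } else {
            d_cur = std::min(d_max, d_cur * 2); // idle: back off
        }
    }

private:
    int d_min;
    int d_max;
    int d_cur;
};
```

A source block would pass `current_ms()` as its poll timeout and call `update()` with whatever the work routine returned.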

-josh