Re: How to reduce reconfiguration latency

Hi Marcus,

Thanks for the reply. Let me try to answer your questions.

Yes. The signal frequency is shifted by +/- 5%.

The transfer burst is constant.

The frequency shift remains constant during a burst. So I estimate the
shift at the beginning of a burst, and fix the settings of the demodulator
and resampler over the whole burst. I use an FFT and interpolation to
estimate the frequency shift.
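Roughly, the estimation step looks like this (a simplified sketch I wrote
for illustration, not my actual code; the function name, window choice,
and parabolic interpolation are my assumptions, and it expects at least
nfft samples):

```python
import numpy as np

def estimate_freq_offset(samples, samp_rate, nfft=4096):
    # Magnitude spectrum of one windowed block from the start of the burst.
    win = np.hanning(nfft)
    spec = np.abs(np.fft.fft(samples[:nfft] * win))
    k = int(np.argmax(spec))
    # Parabolic interpolation around the peak bin for sub-bin resolution.
    a, b, c = spec[(k - 1) % nfft], spec[k], spec[(k + 1) % nfft]
    delta = 0.5 * (a - c) / (a - 2 * b + c)
    # Map the (possibly wrapped) bin index to a signed frequency in Hz.
    freq = (k + delta) * samp_rate / nfft
    if freq > samp_rate / 2:
        freq -= samp_rate
    return freq
```

The estimate from the first nfft samples of a burst is then used to fix
the demodulator and resampler settings for the rest of the burst.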

The “estimate frequency shift then configure demodulator” approach works,
except it takes 600-700 ms for the demodulator to start outputting
meaningful data after receiving the correct configuration. I call this
600-700 ms the reconfiguration latency.

Maybe I used the wrong term. By reconfiguration I mean I make two function
calls: one is set_frequency on the signal source, the other is
set_resamp_ratio on the fractional resampler. My data rate is 44.1k, so I
lose about 30k samples while waiting for the reconfiguration to take
effect.
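To put a number on it (simple arithmetic on my part, taking the midpoint
of the observed latency):

```python
samp_rate = 44100      # my data rate, samples per second
latency = 0.65         # seconds, midpoint of the 600-700 ms I observe
lost = samp_rate * latency
print(round(lost))     # 28665, i.e. roughly 30k samples per burst
```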

Regards,
Bolin

I wish to report my findings on the idea of pausing the flow graph, and
hope to get feedback from the experts on this list. I only started using
GNU Radio last month, so it is likely I don’t have enough experience to
understand it properly.

My understanding of flow graph execution was that a scheduler checks the
blocks in a round-robin fashion and executes the work function of a block
if the block can make progress. So my intention was to add functions to
the scheduler to allow pause and resume from Python via the top block.
When paused, the scheduler won’t run any block’s work function.

I looked into the scheduler source code for ways of implementing pause
and resume. It seems very hard to pass the intention to pause or resume
to the scheduler. Specifically, top_block_impl creates a scheduler_tpb,
and scheduler_tpb creates a thread using the functor thread_body_wrapper.
The thread_body_wrapper contains another functor, tpb_container. In
tpb_container’s () operator, a tpb_thread_body is constructed. So the way
scheduler_tpb runs a block is by creating a thread that runs the
constructor of tpb_thread_body through two levels of functors. The
tpb_thread_body constructor contains an infinite loop; in every iteration
of the loop, the work function of the block running on that thread is
called if permissible. So tpb_thread_body seems to be a good place to
pause the flow graph, and I need to send the intention to pause or resume
to tpb_thread_body. The intention can easily go from Python to top_block
to top_block_impl to scheduler_tpb. But the tpb_thread_body is created by
the thread library and not readily accessible from scheduler_tpb.
Furthermore, there are two levels of functor indirection between
scheduler_tpb and tpb_thread_body.

I did find two existing ways of passing information to tpb_thread_body:
thread interrupts and message passing. But neither seems to be the proper
way of sending the pause and resume information.

Any suggestions are welcome.

Thanks,
Bolin


Hi Bolin,

My compliments for being so investigative about your issue!

My understanding of the flow graph execution was a scheduler checks
the blocks in a round robin fashion, and execute the work function of a
block if the block can make progress.

Not really. As you noticed further on, the scheduler is called “tpb”,
which stands for “thread per block”.
What happens is that every block is executed in its own thread, which
sleeps until one of the neighboring blocks notifies the scheduler that it
has done some work, and the scheduler then notifies the block thread that
it can continue.
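As a toy illustration (plain Python threads and queues, not GNU Radio’s
actual classes): each “block” thread sleeps on its input queue until the
upstream thread wakes it by producing data:

```python
import queue
import threading

# Toy "thread per block" model: each block runs in its own thread and
# sleeps on its input queue until the upstream block produces something.
def block_thread(work, inq, outq):
    while True:
        item = inq.get()      # sleep until the upstream block notifies us
        if item is None:      # shutdown sentinel
            outq.put(None)
            return
        outq.put(work(item))  # produce output, waking the downstream block

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=block_thread, args=(lambda x: x * 2, q1, q2)),
    threading.Thread(target=block_thread, args=(lambda x: x + 1, q2, q3)),
]
for t in threads:
    t.start()
for x in range(3):
    q1.put(x)
q1.put(None)                  # ask the chain to shut down

results = []
while (item := q3.get()) is not None:
    results.append(item)
for t in threads:
    t.join()
print(results)                # [1, 3, 5]
```

No thread polls in a round-robin loop; each one is woken only when its
neighbor has made progress.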

So my intention was to add functions to the scheduler to allow
pause and resume from python via top block. When paused, the scheduler
won’t run any block’s work function.

Interesting, yet I don’t understand why you would want to do that. How
does that paused time help you reduce your latency?
If I understood you correctly, the number of samples until your
frequency offset estimator has come to an estimate, plus the number of
samples that are already in the pipeline between that estimator and
the frequency correction, is very large (44.1 kHz * 0.6-0.7 s =~ 30,000).
I still blame that on the estimator ;-) and I don’t think you can solve
that “information theory sampling time” latency issue by stopping
“reality processing time”…

You can try to reduce the number of samples that are worked on each time
(set_max_noutput_items) or even limit the maximum buffer size between two
blocks (set_max_output_buffer), both of which might hurt you. Please have
a look at the GNU Radio Manual and C++ API Reference.

Anyway, have you had a look at gr::top_block::lock()? It’s meant to
stop the flowgraph prior to reconfiguration (which in “official” GR
lingo means you dis- and reconnect some blocks). I think there was a
discussion yesterday about whether that empties buffers, and I believe it
doesn’t touch the buffers of connections you don’t change. I’m not quite
sure about that, though.

Greetings,
Marcus

I found pausing the flow graph to be the wrong action for my situation.

I tried to find out how many samples are in my flow graph. I inserted
logging into the work functions of two blocks in my flow graph. One is
the signal source whose frequency is altered when a frequency shift is
detected. The other is the last block of my flow graph. The log prints
the value of nitems_written() for each block. The difference in items
written between these two blocks is roughly equal to the number of
samples I lost waiting for the new setting to take effect. So I think the
latency is the time it takes to drain the samples that have gone too far
down the flow graph to be affected by the new setting.
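In other words (a toy calculation of mine, not GNU Radio code): the
latency should be the in-flight sample count divided by the stream rate:

```python
def drain_time(src_written, sink_written, samp_rate):
    # Samples still in flight between the source and the last block...
    in_flight = src_written - sink_written
    # ...take this long to drain through at the nominal rate.
    return in_flight / samp_rate

# ~30k in-flight samples at 44.1 kS/s matches the 600-700 ms I measured.
print(drain_time(1_000_000, 970_000, 44100))  # ~0.68 s
```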

Obviously, pausing the flow graph won’t help. Instead, it would be
helpful if I could speed up draining the samples that can’t be affected
by the new setting, something like
flowgraph.clear_downstream(basic_block_sptr start_block), which discards
all samples in blocks that are reachable from start_block’s output ports
according to the flow graph. The time spent moving and processing useless
samples could be saved. Does this idea make sense?

Thanks,
Bolin

Try decreasing the buffer size in gnuradio-runtime/lib/flat_flowgraph.cc
I use:
#define GR_FIXED_BUFFER_SIZE 2048


On Thu, Apr 24, 2014 at 10:14 PM, Vanush V. [email protected]
wrote:

Try decreasing the buffer size in gnuradio-runtime/lib/flat_flowgraph.cc
I use:
#define GR_FIXED_BUFFER_SIZE 2048

There are a number of hooks on each block for setting various values to
help you control stuff, like set_max_output_buffer. See the docs for more:

http://gnuradio.org/doc/doxygen/classgr_1_1block.html

Just be careful using these! They are advanced functions, since you’re
messing with the scheduler’s behavior.

Tom

On Fri, Apr 25, 2014 at 7:06 PM, Bolin H. [email protected] wrote:

(3) tp.lock() and tp.unlock(). […] the signal source’s nitems_written()
starts from 0 again after the 5 seconds blank. So I suspect the samples in
some blocks were lost during unlock. Anyway, the 5 second restart time
means this is not the right solution to my problem.

If you aren’t disconnecting/reconnecting blocks in a flowgraph, there is
no need to lock and unlock the flowgraph.

Tom

Thanks to Vanush, Marcus, and Tom for the suggestions. I am happy to
report that I was able to reduce the wasted samples from about 30k to
about 1.5k.

Here’s what I tried:

(1) #define GR_FIXED_BUFFER_SIZE 2048 in flat_flowgraph.cc. This gave me
the result reported above. The downside of this approach is that I needed
to rebuild and reinstall GNU Radio. What about some hook to set this at
start-up time?

(2) Set max_noutput_items via tp.run(2048). The latency stayed about the
same. So limiting the number of output items alone isn’t enough to reduce
the number of samples in the flow graph.

(3) tp.lock() and tp.unlock(). My application stopped working after I
made these calls, because the time it takes to unlock the flow graph is
longer than transferring the burst of data. I only changed the frequency
of a signal source and the interpolation ratio of a resampler; I didn’t
make any disconnections or connections. From the log, it seems the whole
flow graph was restarted. There was a ~5 second blank without any log
before the flow graph started to run again, and the signal source’s
nitems_written() started from 0 again after the 5 second blank. So I
suspect the samples in some blocks were lost during unlock. Anyway, the 5
second restart time means this is not the right solution to my problem.

(4) block.set_max_output_buffer(). I tried setting this value on a couple
of the blocks and found it not sufficient to reduce the number of wasted
samples. There are too many blocks in my flow graph to set them all, and
maintaining the settings might be a problem. So I decided to abandon this
approach.
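For what it’s worth, my mental model of why (1) worked so well (this is
an assumption of mine about the scaling, not a statement about GNU Radio
internals, and the default buffer capacity below is made up for
illustration):

```python
def wasted_samples(n_downstream_edges, buf_items):
    # Rough model: each edge between the reconfigured block and the sink
    # can hold up to buf_items samples produced under the old setting,
    # and all of them must drain before the new setting takes effect.
    return n_downstream_edges * buf_items

# Shrinking every buffer shrinks the waste (and the drain latency)
# proportionally, regardless of how many blocks are downstream.
default_items = 32768   # hypothetical default per-edge capacity
ratio = wasted_samples(10, 2048) / wasted_samples(10, default_items)
print(ratio)            # 0.0625, i.e. a 16x reduction
```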

Thanks,
Bolin