Loopback dropping final bits

Same loopback code I emailed about earlier; this time I attached the
complete file (modulo some cleanup).

Here’s my input file (in stupid x86 short ordering…):

$ hexdump input.txt
0000000 bbaa ddcc ffee 1100 3322 5544 7766 9988
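For reference, hexdump is just displaying pairs of bytes as little-endian
16-bit words; the bytes actually on disk are aa bb cc dd ... 88 99. A small
Python check (the hex string below is read straight off the dump above):

    data = bytes.fromhex("aabbccddeeff00112233445566778899")
    # regroup into the little-endian 16-bit words that hexdump prints
    words = " ".join(f"{data[i+1]:02x}{data[i]:02x}" for i in range(0, len(data), 2))
    print(words)   # -> bbaa ddcc ffee 1100 3322 5544 7766 9988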

and then after going through loopback.py and being packed back to bytes:

$ hexdump output.txt
0000000 bbaa ddcc ffee 1100 3322 5544 7766 8088
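The two dumps differ only in the final byte: 0x99 went in, 0x80 came out.
If the packing is MSB-first (an assumption; loopback.py isn't shown here),
that is exactly the trailing five bits coming back as zeros:

    expected, got = 0x99, 0x80
    print(f"{expected:08b}")   # 10011001
    print(f"{got:08b}")        # 10000000  -- first 3 bits survive, last 5 are zeroed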

For verification that the packing worked,

$ hexdump output.bin
0000000 0001 0001 0001 0001 0001 0101 0001 0101
0000010 0101 0000 0101 0000 0101 0100 0101 0100
0000020 0101 0001 0101 0001 0101 0101 0101 0101
0000030 0000 0000 0000 0000 0000 0100 0000 0100
0000040 0000 0001 0000 0001 0000 0101 0000 0101
0000050 0100 0000 0100 0000 0100 0100 0100 0100
0000060 0100 0001 0100 0001 0100 0101 0100 0101
0000070 0001 0000 0001 0000 0001 0000

Somewhere along the line, the last 5 symbols are being lost. I verified
this with an input file containing hex ‘FFFF’, and got this output (the
last zero byte is just hexdump padding the odd-length file).

$ hexdump output.bin
0000000 0101 0101 0101 0101 0101 0101 0101 0101
0000010 0101 0101 0101 0101 0101 0001
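A quick way to count the shortfall, given that output.bin holds one
unpacked symbol per byte as in the dumps above (file names are the ones
used earlier in this thread; substitute whatever input you actually fed in):

    n_expected = len(open("input.txt", "rb").read()) * 8   # 8 symbols per input byte
    n_received = len(open("output.bin", "rb").read())
    print(n_expected - n_received, "symbols missing")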

Any ideas as to why this happens?

Also: If I replace GMSK with DBPSK (:%s/gmsk/dbpsk/g will do it), I get
the following output:

$ hexdump output.txt
0000000 0100 aa1e 33ef bb77 00fc 8844 11cd 9855

instead of the GMSK output (0000000 bbaa ddcc ffee 1100 3322 5544 7766
8088). Does DBPSK just suck, or do I need to be doing something
else? I tried using Tom’s new GMSK/DBPSK implementations but ran into
some interpreter errors trying to coerce the blks2 code into the flow
graph.

This may come back to some of the other problems people have been
emailing about regarding the last bits of data being lost when stopping
the flow graph…

-Dan

On Wed, Mar 28, 2007 at 09:39:14PM -0700, Dan H. wrote:

$ hexdump output.txt
0000000 bbaa ddcc ffee 1100 3322 5544 7766 8088

Regarding losing the last few symbols, try


graph.wait()
time.sleep(1)
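For context, a minimal sketch of where that would sit at the end of a
loopback-style script, assuming the gr.flow_graph API current at the time
(the variable name and connections are placeholders; the real loopback.py
isn't shown in this thread):

    import time
    from gnuradio import gr

    graph = gr.flow_graph()
    # ... connect file_source -> mod -> demod -> file_sink here ...
    graph.start()
    graph.wait()       # returns when the scheduler decides the graph is done
    time.sleep(1)      # extra settling time, as suggested above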

Eric

Eric B. wrote:

time.sleep(1)
No effect. I’d already tried it with the long pauses in there from
earlier… Oh, and I’ve also tried it with stop and wait in the other
order, and no stop at all… I’ll get some debug prints in tomorrow and
see if I can’t track this one down but I suspect the chain isn’t fully
emptying for some other reason.

-Dan

Eric B. writes:

$ hexdump output.txt
0000000 bbaa ddcc ffee 1100 3322 5544 7766 8088

Regarding losing the last few symbols, try


graph.wait()
time.sleep(1)

I wonder if there is data somewhere in the flowgraph that’s less than
the amount needed for the next block to run. Perhaps there should
be some sort of drain operation, or query for this (that adds over
components), so one can find out what’s going on.

On Thu, Mar 29, 2007 at 08:04:22AM -0400, Greg T. wrote:

time.sleep(1)

I wonder if there is data somewhere in the flowgraph that’s less than
the amount needed for the next block to run. Perhaps there should
be some sort of drain operation, or query for this (that adds over
components), so one can find out what’s going on.

It should drain completely, unless some block is specifying an
output_multiple that cannot be specified.

Probably the easiest way to debug this is to enable logging in
gr_single_threaded_scheduler.cc. To do that, change
the ENABLE_LOGGING define on line 37 to a 1.

This will create ascii file(s) named sst-N.log in the current directory.

Eric

On Thu, Mar 29, 2007 at 08:01:38AM -0700, Eric B. wrote:

On Thu, Mar 29, 2007 at 08:04:22AM -0400, Greg T. wrote:

I wonder if there is data somewhere in the flowgraph that’s less than
the amount needed for the next block to run. Perhaps there should
be some sort of drain operation, or query for this (that adds over
components), so one can find out what’s going on.

It should drain completely, unless some block is specifying an
output_multiple that cannot be specified.
^^^^^^^^^
satisfied

(Eric, sorry I keep failing to reply all)

Eric B. wrote:

On Thu, Mar 29, 2007 at 08:01:38AM -0700, Eric B. wrote:

On Thu, Mar 29, 2007 at 08:04:22AM -0400, Greg T. wrote:

I wonder if there is data somewhere in the flowgraph that’s less than
the amount needed for the next block to run. Perhaps there should
be some sort of drain operation, or query for this (that adds over
components), so one can find out what’s going on.

This is the final state of the flow graph:

<gr_block file_source (0)> source
noutput_items = 32767
general_work: noutput_items = 32767 result = -1
were_done

<gr_block bytes_to_syms (1)> regular 1:8
max_items_avail = 0
noutput_items = 8184
BLKD_IN
were_done

<gr_block interp_fir_filter_fff (2)> regular 1:2
max_items_avail = 4
noutput_items = 8190
BLKD_IN
were_done

<gr_block frequency_modulator_fc (3)> regular 1:1
max_items_avail = 0
noutput_items = 4094
BLKD_IN
were_done

<gr_block quadrature_demod_cf (4)> regular 1:1
max_items_avail = 1
noutput_items = 8182
BLKD_IN
were_done

<gr_block clock_recovery_mm_ff (5)> regular 2:1
max_items_avail = 9
noutput_items = 8191
BLKD_IN
were_done

<gr_block binary_slicer_fb (6)> regular 1:1
max_items_avail = 0
noutput_items = 32767
BLKD_IN
were_done

It appears that the elements are not draining, although I’m unclear on
why there seem to be 9 symbols remaining in the flow despite the fact
that I’m only missing 5 symbols. An EOF-related problem?

I’ll keep looking…

-Dan

Dan H. wrote:

This is the final state of the flow graph:

<gr_block clock_recovery_mm_ff (5)> regular 2:1
max_items_avail = 9
noutput_items = 8191
BLKD_IN
were_done

I realized that the other two flow graph elements with max_items_avail >
0 show the same max_items_avail at initialization; this makes sense
since e.g. the filters may have more taps than data. However, the
clock_recovery_mm_ff block had max_items_avail=0 at program start. Also,
there was one step where it violated the correct 2:1 in/out ratio by
doing 31:15 instead; these two effects combine to 10 missing inputs (the
9 left over plus the 1 extra consumed) on a 2:1 block, or 5 missing
outputs. Aha!

[*** 48 items in, 15 out -> 18 left ***]
<gr_block clock_recovery_mm_ff (5)> regular 2:1
  max_items_avail = 48
  noutput_items = 8191
  general_work: noutput_items = 15 result = 15

...

[*** 30 items in, 18 already -> 48 left ***]
<gr_block quadrature_demod_cf (4)> regular 1:1
  max_items_avail = 31
  noutput_items = 8174
  general_work: noutput_items = 30 result = 30

[*** Only 47 available??? ***]
<gr_block clock_recovery_mm_ff (5)> regular 2:1
  max_items_avail = 47
  noutput_items = 8191
  general_work: noutput_items = 15 result = 15
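Sanity-checking that bookkeeping with plain arithmetic (nothing here beyond
what the quoted log shows):

    left_after_first_pass = 48 - 2 * 15              # 18 items should remain queued
    available_next_pass = left_after_first_pass + 30  # plus quadrature_demod's output
    print(available_next_pass)                        # 48 expected, log shows only 47
    # one input unaccounted for there (Dan reads it as a 31:15 step), plus the
    # 9 still queued at shutdown:
    print((1 + 9) // 2)                               # 5 outputs short, matching the
                                                      # 5 missing symbols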

Does this analysis make sense? Delving deeper into clock recovery now…

-Dan

Greg T. wrote:

I wonder if there is data somewhere in the flowgraph that’s less than
the amount needed for the next block to run. Perhaps there should
be some sort of drain operation, or query for this (that adds over
components), so one can find out what’s going on.

This appears to be the problem. For this configuration, the
gr_clock_recovery_mm_ff block requires at least 10 inputs - the
instrumented forecast method prints:

16382 inputs required for 8191 outputs
8194 inputs required for 4095 outputs
4100 inputs required for 2047 outputs
2053 inputs required for 1023 outputs
1030 inputs required for 511 outputs
518 inputs required for 255 outputs
262 inputs required for 127 outputs
134 inputs required for 63 outputs
70 inputs required for 31 outputs
38 inputs required for 15 outputs
22 inputs required for 7 outputs
14 inputs required for 3 outputs
10 inputs required for 1 outputs
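Incidentally, the smaller rows of that table fit a simple pattern:
inputs = 2*outputs + 8, i.e. roughly omega (2 samples per symbol) per
output plus eight extra items, presumably the interpolator's history.
This is just an observation from the numbers above, not the actual
forecast code, and the four largest rows don't fit it exactly:

    rows = [(10, 1), (14, 3), (22, 7), (38, 15), (70, 31),
            (134, 63), (262, 127), (518, 255), (1030, 511)]
    for inputs, outputs in rows:
        assert inputs == 2 * outputs + 8
    # hence the floor of 10 inputs: the block can never run on anything less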

Since this block starts with 0 inputs and no more data arrives once the
file source runs dry, the last <10 samples are left stranded in the flow
graph. This could well be the problem with the last bits of the last
packet always being dropped/corrupted, because if you were using queues
and enqueued a message followed by a close_queue packet, they could well
get shipped through the pipeline together.

This is where my expertise runs out; what’s the right fix? Is it
block-specific (either pre-pend 0s or detect the end of data and append
0s), or more general?

-Dan

On Thu, Mar 29, 2007 at 05:16:45PM -0700, Dan H. wrote:

22 inputs required for 7 outputs
This is where my expertise runs out; what’s the right fix? Is it
block-specific (either pre-pend 0s or detect the end of data and append
0s), or more general?

-Dan

Try applying this patch. It has the effect of preloading ntaps()-1
zeros into the block’s input stream.

Eric

--- gr_clock_recovery_mm_ff.cc      2006-12-20 10:40:57.000000000 -0800
+++ /tmp/gr_clock_recovery_mm_ff.cc  2007-03-29 17:59:42.000000000 -0700
@@ -61,6 +61,8 @@
   set_omega(omega);                  // also sets min and max omega
   set_relative_rate (1.0 / omega);
 
+  set_history(d_interp->ntaps());
+
   if (DEBUG_CR_MM_FF)
     d_logfile = fopen("cr_mm_ff.dat", "wb");
 }
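A rough, library-free illustration of the trade-off in that preload (just
the idea, not GNU Radio code): a block whose interpolator needs ntaps
samples of context cannot emit anything for the first ntaps-1 inputs; with
zeros preloaded it can start immediately, but its output is then what a
zero-padded stream would produce, i.e. everything is effectively delayed.

    ntaps = 4
    x = [1, 2, 3, 4, 5, 6]

    def fir(samples):
        # toy boxcar: one output per position with a full ntaps of context
        return [sum(samples[i:i + ntaps]) for i in range(len(samples) - ntaps + 1)]

    print(fir(x))                        # [10, 14, 18] -- only 3 outputs for 6 inputs
    print(fir([0] * (ntaps - 1) + x))    # [1, 3, 6, 10, 14, 18] -- all 6, but shifted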

Eric B. wrote:

+  set_history(d_interp->ntaps());
+
   if (DEBUG_CR_MM_FF)
     d_logfile = fopen("cr_mm_ff.dat", "wb");
 }

No; this appears to at least shift and likely corrupt the bits [I got
abba cdbc efde 01f0 2312 4534 6756 8878 instead of bbaa ddcc ffee 1100
3322 5544 7766 9988]. If I add an extra 0 byte at the end, these bits
come through, but then the last 5 bits of that extra byte do not. Maybe
the flow graph needs to do something like:

At a disruption of continuity of the input (EOF, USRP goes into TX
mode, etc.), append enough 0 samples to clear the pipeline, reinitialize
all (or selected…) flow graph blocks to their virgin state, and then
continue.
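One crude way to test the "append zeros" half of that idea without touching
any blocks is to pad the input file itself; the single extra 0 byte above
already pushed the loss into the pad, so a slightly larger pad should keep
the loss entirely inside throwaway bytes. The 16-byte figure below is a
guess, comfortably above the ~10-sample floor of the clock recovery block:

    pad_bytes = 16                      # a guess; plenty given the ~10-sample floor
    with open("input.txt", "ab") as f:  # file name as used earlier in the thread
        f.write(b"\x00" * pad_bytes)
    # re-run loopback.py and ignore the trailing zero bytes in output.txt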

The counter-argument is that when receiving symbols from a stream (like
off the USRP), this won’t be a problem since the source is continuous -
even if it flips into TX mode, if all samples of the data are there, the
symbols will be recovered when the pipeline restarts. Maybe that means
file sources violate some GNU Radio invariant…

-Dan

On Thu, Mar 29, 2007 at 06:27:24PM -0700, Dan H. wrote:

No; this appears to at least shift and likely corrupt the bits [I got abba cdbc efde 01f0 2312 4534 6756 8878 instead of bbaa ddcc ffee 1100 3322 5544 7766 9988].

You know, the short format you’re using isn’t really helping show
what’s going on… See attached files for a suggestion. Or print
the output in binary.

The counter-argument is that when receiving symbols from a stream (like
off the USRP), this won’t be a problem since the source is continuous -
even if it flips into TX mode, if all samples of the data are there, the
symbols will be recovered when the pipeline restarts. Maybe that means
file sources violate some GNU Radio invariant…

Finite sources such as the gr.file_source are the corner case. I’m
not sure I want to mess with the underlying code over this. On the
other hand, we may see more of this when we embed flow graphs in
mblocks, and if so, we’ll need to revisit the “flush” and “reset”
options.

I’m open for discussion and patches that address this.

Eric