Buffer Overflow Debug

Detlef_R · May 15, 2015, 12:27am

Hi all,

I’m working on an incredibly annoying issue related to my use of the
Header/Payload Demux (HPD) block. I think it’s related to a buffer
overflow at some point, but I’m having a really hard time coming up with
a proper debug strategy to nail this down.

What I’m seeing is that my data streams freeze after the input to the HPD
block, on both the header branch and the payload branch. Everything
before the HPD block continues without issue. The time it takes the
streams to freeze is HIGHLY variable. I’ve watched it run for 30 minutes
straight before a freeze, and I’ve watched it freeze a few seconds after
start. I’m using tags generated by the Correlation Estimator as the
trigger for the HPD block.

My question is this: if I suspect a buffer overflow is causing a freeze,
how would I prove this to myself?

v/r,
Rich

reback · May 15, 2015, 1:06am

On 14.05.2015 15:26, Richard B. wrote:

streams to freeze is HIGHLY variable. I’ve watched it run for 30 minutes
straight before a freeze and I’ve watched it freeze a few seconds after
start. I’m using tags generated by the Correlation Estimator as the
trigger for the HPD block.

My question is this, if I suspect a buffer overflow is causing a freeze,
how would I prove this to myself?

Rich,

A “buffer overflow” wouldn’t cause GR to freeze; it would crash. Going by
your previous messages, I suspect what you’re seeing is the HPD starting
to block, causing backpressure that eventually reaches the source.
(Correct me if I’m wrong.)

I remember you previously mentioning something similar. Did you confirm
that the header parser actually sends out a message for every data
packet it receives? If it doesn’t, that is a case in which the HPD, by
design, will fail.

As a debugging strategy, I would recommend printing out the state
changes inside the HPD state machine. If it freezes, it would be
interesting to see in which state that is.
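
A minimal sketch of that kind of state-change logging, written in plain Python rather than in the block's C++ so it can be run on its own. Only STATE_FIND_TRIGGER is named in this thread; the other state names, the `StateLogger` class, and the offsets are hypothetical placeholders, not the HPD source.

```python
from enum import Enum


class State(Enum):
    # FIND_TRIGGER mirrors STATE_FIND_TRIGGER from the thread;
    # the remaining names are placeholders for illustration only.
    FIND_TRIGGER = 0
    HEADER = 1
    WAIT_FOR_MSG = 2
    PAYLOAD = 3


class StateLogger:
    """Records every state change together with the sample offset."""

    def __init__(self, initial=State.FIND_TRIGGER):
        self.state = initial
        self.history = []  # list of (offset, old_state, new_state)

    def set_state(self, new_state, offset):
        # Log only real transitions so the output stays readable.
        if new_state is not self.state:
            self.history.append((offset, self.state, new_state))
            print(f"[{offset}] {self.state.name} -> {new_state.name}")
            self.state = new_state

    def last_transition(self):
        return self.history[-1] if self.history else None


log = StateLogger()
log.set_state(State.HEADER, 1000)
log.set_state(State.WAIT_FOR_MSG, 1032)
log.set_state(State.PAYLOAD, 1032)
log.set_state(State.FIND_TRIGGER, 2056)
log.set_state(State.FIND_TRIGGER, 2056)  # no change, nothing is logged
```

After a freeze, the last logged transition shows which state the machine never left.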

Cheers,
Martin

reback · May 20, 2015, 8:03pm

Hi Martin,

Sorry for the delay in response. We have been able to put some time into
debugging this issue and here is what we’ve found:

1. We have confirmed that the input stream to the HPD block is correctly
   tagged when the block freezes. The tags we set as “trigger tag” in
   the block are on the input stream.
2. At some point in the flowgraph operation, the HPD block gets stuck in
   the STATE_FIND_TRIGGER state (idle state). This happens even though
   the trigger tags are present on the input, as confirmed in 1 above.
   We are observing that get_tags_in_range is failing to find the tags
   in the stream. We can’t figure out what would cause this, or whether
   this is even the issue. It’s the best idea we have.

So we agree there is no buffering issue. The issue is with the HPD block
not seeing tags that are confirmed to enter the input port. We would
love some help on this issue.

To debug, we copied and pasted the built-in HPD block source into our own
custom gr_modtool module and added cout debug statements there.

v/r,
Rich


reback · May 21, 2015, 3:28am

Yes I will file a bug.

We put a tag_debug block right before the HPD input. The tag debug
stdout statements continue, while the get_tags_in_range function of the
HPD block returns none.
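
One way to cross-check the two observations is to log the absolute range searched on each call and compare it with the offsets that Tag Debug prints. The sketch below models a half-open search over absolute offsets, which is assumed here to match how get_tags_in_range behaves. `Tag`, `tags_in_range`, and `scan_in_calls` are hypothetical stand-ins in plain Python, not the GNU Radio API, and the tag keys are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a stream tag: absolute offset plus key.
Tag = namedtuple("Tag", ["offset", "key"])


def tags_in_range(tags, abs_start, abs_end, key=None):
    """Return tags with abs_start <= offset < abs_end, optionally by key."""
    return [t for t in tags
            if abs_start <= t.offset < abs_end
            and (key is None or t.key == key)]


def scan_in_calls(tags, total_items, chunk, key):
    """Walk the stream in fixed-size chunks and collect matching tags."""
    found = []
    start = 0
    while start < total_items:
        end = min(start + chunk, total_items)
        print(f"searching [{start}, {end})")
        found.extend(tags_in_range(tags, start, end, key))
        start = end
    return found


stream_tags = [Tag(0, "trigger"), Tag(511, "trigger"),
               Tag(512, "trigger"), Tag(900, "other")]
hits = scan_in_calls(stream_tags, total_items=1024, chunk=512, key="trigger")
print([t.offset for t in hits])
```

If consecutive search windows leave a gap, or a window is empty, a tag sitting in that gap is never returned even though it is present on the stream.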

We will continue debugging this to see if we can learn something more
concrete. The time it takes the flow graph to produce this error is
highly variable, from seconds to hours.

If there is other useful debug info you’d like us to generate please let
me know and we will.

V/r,
Rich

reback · May 21, 2015, 3:06am

This is interesting, and kinda serious. Also, we’ve had reports that
tags go missing in the past, but it’s something that’s hard to verify.

How did you confirm the input stream is correctly tagged? If
get_tags_in_range isn’t finding tags that it should, that is most likely
the issue you’re seeing here.

Can you please file a bug report on the issue tracker, with as much
detail as you can provide?

On a side note, you might be able to use the trigger input instead of
the tags, but that doesn’t solve this problem.

Thanks,
Martin

reback · May 27, 2015, 2:46pm

Hey Richard,

I’m going to need some time, peace and quiet to get to the bottom of
this, which means not in the very near future. But I’m very interested
in getting this to work. Can you post some failing code/OOTs somewhere
for me to start testing?

M

reback · May 22, 2015, 9:23pm

Martin et al.,

I implemented my own block that produces a trigger signal when a specific
tag is encountered, and I feed this into the detect port of the HPD
block. This is an attempt to overcome the tag trigger issues we’ve been
having in this email chain. The trigger_tag block simply outputs zeros
until a specified tag_key is found, at which point it outputs a 1 on that
sample and then continues outputting zeros. It has the same trigger
behavior as the Schmidl & Cox block used to create the detect signal in
the OFDM example.
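
The behavior described for the trigger_tag block can be sketched in plain Python as follows. This is not the block's actual code: the function name, the `(offset, key)` tag representation, and the key strings are assumptions made for illustration.

```python
def trigger_signal(n_items, tags, tag_key, start_offset=0):
    """Return n_items of 0/1 with a 1 at every sample carrying tag_key.

    tags is a list of (absolute_offset, key) pairs; start_offset is the
    absolute offset of the first item in this chunk.
    """
    out = [0] * n_items
    for offset, key in tags:
        rel = offset - start_offset  # position inside this chunk
        if key == tag_key and 0 <= rel < n_items:
            out[rel] = 1
    return out


# Hypothetical tags: only "corr_start" should raise the trigger.
tags = [(3, "corr_start"), (5, "phase_est"),
        (9, "corr_start"), (20, "corr_start")]
detect = trigger_signal(12, tags, "corr_start")
print(detect)
```

Converting the absolute tag offset to a position relative to the current chunk is the step where an off-by-one would shift the trigger, so it is worth checking first.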

This has resolved the HPD block freezing issue. However, a new issue has
arisen. Now the block will output the correct header and payload for a
little while, and then randomly start passing the wrong header portion
through, even though the detect signal is still on the correct trigger
point at the input to the HPD block. I confirmed this by comparing the
HPD port0 input with the HPD port1 input, confirming the detect peaks
correspond exactly with the port0 tags that signify a header start. Even
while the HPD block allows the wrong header portion through, the detect
signal is aligned perfectly with the header start at input0.
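
A comparison like the one described can be automated on captured data. The sketch below assumes the detect stream and the header-start tag offsets have already been dumped to Python lists; the function name and the sample values are hypothetical.

```python
def compare_trigger_to_tags(detect, tag_offsets):
    """Return (peaks_without_tag, tags_without_peak) as sorted lists."""
    peaks = {i for i, v in enumerate(detect) if v != 0}
    tagged = set(tag_offsets)
    return sorted(peaks - tagged), sorted(tagged - peaks)


detect = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1]

# Perfect alignment: both lists come back empty.
extra, missing = compare_trigger_to_tags(detect, [2, 6, 9])
print(extra, missing)

# One tag shifted by a sample: the mismatch shows up on both sides.
shifted_extra, shifted_missing = compare_trigger_to_tags(detect, [2, 7, 9])
print(shifted_extra, shifted_missing)
```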

If you wouldn’t mind confirming my HPD block settings, I will detail
them here. I don’t think this could be the issue, because I expect the
block would never work correctly if they were off, but better to be
safe. The input0 type is complex unpacked data. My header is 32 bits
long packed. I correspondingly set the following:

Header Length (Symbols): 4
Items per Symbol: 8
Guard Interval (items): 0
Output Format: Items
IO Type: Complex
Sampling Rate:

Now, to add to the confusion around this block, I tried the following
settings and they worked better than the settings above. However,
looking at the source code, it makes no sense to me why they should work
at all. This is because the payload size is calculated by multiplying the
packet_length read off the message passed in by the items_per_symbol
variable. This would correspond to a huge payload (128*32 in my case,
instead of 128*8), yet it works.

Header Length (Symbols): 1
Items per Symbol: 32

Using the other possible combination that produces the correct header
length when multiplied together, it does not work for any length of time.

Header Length (Symbols): 2
Items per Symbol: 16
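
The arithmetic behind the three combinations above can be laid out as a short sketch. It assumes the payload size is packet_length times items_per_symbol, as described earlier in this post, and it assumes a packet_length of 128, which is inferred from the payload figures quoted above rather than stated outright. The function name is hypothetical.

```python
def item_counts(header_len_symbols, items_per_symbol, packet_length):
    """Return (header_items, payload_items) for one settings combination."""
    header_items = header_len_symbols * items_per_symbol
    payload_items = packet_length * items_per_symbol
    return header_items, payload_items


PACKET_LENGTH = 128  # assumed value, see the note above

results = {}
for hdr_syms, ips in [(4, 8), (1, 32), (2, 16)]:
    results[(hdr_syms, ips)] = item_counts(hdr_syms, ips, PACKET_LENGTH)
    header_items, payload_items = results[(hdr_syms, ips)]
    print(f"{hdr_syms} x {ips}: header={header_items} items, "
          f"payload={payload_items} items")
```

All three combinations give the same 32-item header, but the computed payload differs by a factor of up to four, which is why the 1 x 32 setting looks like it should not work.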

I’m starting to think there is a major bug in the HPD block when used
without the ‘Output Format Symbols’ mode. I say this because this is the
last difference between the OFDM example, which seems to work fine, and
my own. At this point I’m stuck code diving into the HPD block to see if
I can figure out what’s going on.

v/r,
Rich


reback · May 27, 2015, 6:03pm

Martin,

I’ll update you on this in a new thread soon. We had been running into
the age-old problem of plugging one hole and having two more leaks
spring up. We are finally onto something that seems to be fixing
everything. It’s a simple code change in the HPD.

Rich
