Inband timestamp issues

ketancmaheshwari · July 23, 2008, 1:00am

My colleague Steve and I have been working with the inband receive
code in an effort to understand how the timestamps work. Specifically,
we have been printing the timestamps from the usrp_rx block while
running test_usrp_inband_2rx.cc in usrp/host/apps-inband. There is
some puzzling behavior that we have observed and unable to explain. We
are operating with a decimation rate of 64 (i.e. a 1 MHz sample rate).

Note, this is using the gnuradio trunk rev 8586, with the
gnuradio/trunk/usrp/ code unchanged except to print timestamp data in
usrp_rx.cc. There don’t seem to be any major changes since this
revision.

Every so often we notice that the timestamp difference between
packets from channel 0 and channel 1 sometimes changes. This change in
timestamp difference is also accompanied by a sudden spike in the
timestamp value followed by about 7 usb_packets with very similar
timestamps. For example, the trace might look like this for a while:

[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 100000, diff =
7802
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 100262, diff =
262
[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 108064, diff =
7802
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 108326, diff =
262
…,
where diff = the difference between the current and previous
timestamp; and then we get a sudden jump in the timestamp value:

[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 180640, diff =
7802
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 511000, diff =
330360
[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 511990, diff =
990
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 512988, diff =
998
[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 513982, diff =
994
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 514972, diff =
990
[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 517138, diff =
2166
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 518128, diff =
990

and then normal operation will resume, but with a new delay between
the two channels:

[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 525170, diff =
7042
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 526194, diff =
1024
[usrp_rx]: chan 0 received packet len = 504 bytes, ts = 533235, diff =
7041
[usrp_rx]: chan 1 received packet len = 504 bytes, ts = 534258, diff =
1023

This behavior can occur multiple times during the operation of
test_usrp_inband_2rx.cc and results in a new delay offset between the
channels. We noticed that the number of transcient timestamps is
approximately equal to the number of usb_packets that would occupy the
size of the fifo buffers (16Kbits) that feed rx_buffer_inband in the
FPGA code. We think that somehow, the operation of rx_buffer_inband is
stalling which would cause these buffers to fill up and then get
dumped out at once (which might explain the transient behavior above).
Any ideas what might be causing these random jumps in the timestamp?

Although this is a made up trace, it is indicative of the behavior we
have observed. I can send a real trace if anyone feels it is
necessary.

Occasionally we see an out of order packet arrive at usrp_rx in the
inband code. This timestamp is usually for a packet that should arrive
sometime in the future. We are clueless as to why this might be
happening.

This seems to happen 1 in every 5k-10k usb_packets.

–
Ketan M.

ketancmaheshwari · July 23, 2008, 1:38am

On Tue, Jul 22, 2008 at 05:59:17PM -0500, Ketan M. wrote:

My colleague Steve and I have been working with the inband receive
code in an effort to understand how the timestamps work. Specifically,
we have been printing the timestamps from the usrp_rx block while
running test_usrp_inband_2rx.cc in usrp/host/apps-inband. There is
some puzzling behavior that we have observed and unable to explain. We
are operating with a decimation rate of 64 (i.e. a 1 MHz sample rate).

Note, this is using the gnuradio trunk rev 8586, with the
gnuradio/trunk/usrp/ code unchanged except to print timestamp data in
usrp_rx.cc. There don’t seem to be any major changes since this revision.

Paging George N.,… Paging George N…
Please come to the white courtesy telephone…

Eric

ketancmaheshwari · July 23, 2008, 8:15pm

Hi George,

Thanks for the quick and thought-out response. I actually tried with
the new rbf yesterday. It produces no noticeable differences in the
output I’m interested in at this point.

Second, we are not seeing any underrun. I have verified this in two
ways. First, I check pkt->underrun() after reading the packet in
usrp_rx.cc. Second, I logged the packets to file, ran ./read_packets
(redirecting output to another file), and did a
case-insensitive grep of that new file for “underrun”. Nothing ever
comes up.

However, to see if the usrp_rx mblock is being starved, I turned on
some verbose printing in the usrp_server and usrp_usb_interface. The
following is a typical sequence:

(also, long reads by the usrp_rx are often preceded by long gaps where
the usrp_rx is not executing)

In short, it appears as if the timestamp spike is occurring in the
middle of a string of successive packets read from the usrp without
context switching.

Anyway, I figured one solution might be to have the usrp_rx do an
omni_thread::yield() operation after reading several packets. I tried
this out with various numbers of maximum iterations of the while loop
(80, 30, 8), and the problem persists. It seems to happen in the
middle of the longest burst of packets, if that makes sense. So if I
yield after 80 packets, the timestamp will spike around the 47th
packet. If I yield after 30 packets, the timestamp spikes at around
the 12th packet.

The strange thing to us is that the timestamp spike invariably changes
the “steady state” operation of the timestamp offset between the two
channels. This says “FPGA” to us, especially since we aren’t getting
an underrun flag.

Steve

ketancmaheshwari · July 23, 2008, 5:47am

Eric B. wrote:

usrp_rx.cc. There don’t seem to be any major changes since this revision.

Paging George N.,… Paging George N…
Please come to the white courtesy telephone…

I’ve known about a inband timestamp issue. but I’m not sure its the
same.

I’ve fixed this in the code a while ago, but never committed new RBFs
from being busy to build them.

Here is the fix I made on 05/15/08:
http://gnuradio.org/trac/changeset/8430

I didn’t actually get to build the new RBFs until earlier this week:
http://gnuradio.org/trac/changeset/8924
http://gnuradio.org/trac/changeset/8935

I’m not sure this fix explains the jumps in timestamps you’re seeing, as
it was mainly a reset issue causing the TX and RX timestamps to be on
different clocks. But, I’d like you to try it out with the new RBFs so
we are at least on the same page and I know you’re working with the
newest and greatest RBFs. Then if the issue still occurs, I can poke
before I merge to trunk.

If you want my opinion though, I don’t think its an FPGA issue, I think
that its an underrun issue. Can you try checking if the underrun flag
is set on any of the packets around your sudden jumps? You can use the
read_packets application in usrp/host/apps-inband to read a packet dump.
You can enable the packet dump by editing
usrp/host/lib/inband/usrp_rx.cc and setting d_disk_write(true) and
d_write_pkt(true):
http://gnuradio.org/trac/browser/gnuradio/branches/developers/gnychis/inband/usrp/host/lib/inband/usrp_rx.cc#L47

Just use read_packets on your dump, which should be something like
rx_data_chan0.dat and rx_data_chan1.dat … it should print underrun on
the packet if it was set.

The 2TX/RX stuff in the inband code is very iffy in my opinion. All of
the message passing going on and the context switching between the
m-blocks means that the lowest level usrp_rx block might be waiting a
while before it gets time on the CPU again … leaving the FPGA to drop
USB blocks in the mean time. But, the underrun flag should tell you
this.

George

ketancmaheshwari · July 23, 2008, 8:27pm

Steve P. wrote:

Hi George,

Thanks for the quick and thought-out response. I actually tried with
the new rbf yesterday. It produces no noticeable differences in the
output I’m interested in at this point.

Not getting an underrun is good. There may very well be an issue with
the FPGA. Could you do me a favor and see if you get these timestamp
jumps using the single RX chain application? I’d be interested to know
if it is something due to the complication of using two RX chains, or if
there is something inherently wrong with the timestamp. I know what
you’re interested in is the timestamp gap between the two channels, and
it being consistent, but I want to know if you get these timestamp jumps
using the single chain.

It just helps me narrow down the issue. There was a point in time where
I wasn’t even going to release a 2TX/RX RBF because it had some fitting
issues in the FPGA with all of the new inband stuff. I had to clean up
some unused code to get it to fit. The only reason I actually made an
effort to build the 2TX/RX RBF was because of your research group

I spend all of my time working with the single chain, and if I find bugs
I can fix them. But I think you’re the only ones using the dual
chain… so your feedback is useful and I’d appreciate any help solving
the problems. I don’t really know Verilog, and our Verilog coder is at
an internship right now

George

ketancmaheshwari · July 23, 2008, 8:31pm

George,

I am sure that Steve will check the single rx-chain and get back to
you with the results of that test.

Although, I have a hypothesis about what might be happening. Is it
possible that there are some control signals being sent to the USRP
over the command channel that disable the rx_buffer_inband while they
execute? Perhaps this is happening unbeknownst to us. I guess the more
general question here is what events/conditions cause the
packet_builder to be disabled?

On Wed, Jul 23, 2008 at 1:28 PM, George N. [email protected] wrote:

wasn’t even going to release a 2TX/RX RBF because it had some fitting issues

George

–
Ketan M.

ketancmaheshwari · July 23, 2008, 8:34pm

Ketan M. wrote:

packet_builder to be disabled?

The packet builder is never disabled unless you completely stop the RX
chain using the low level USRP reference to it (e.g., d_urx->stop() or
something similar). This is never executed in the example you are
using, so the packet builder should at no point be disabled… it’s
always working, even when control stuff is going on.

George

ketancmaheshwari · July 23, 2008, 8:32pm

George N. wrote:

I spend all of my time working with the single chain, and if I find bugs
I can fix them. But I think you’re the only ones using the dual
chain… so your feedback is useful and I’d appreciate any help solving
the problems. I don’t really know Verilog, and our Verilog coder is at
an internship right now

The main reason I bring this up is that Brian and I were talking on IRC
about the issue earlier, and agreed that a testbench for the FPGA
timestamp would be useful in ensuring the timestamps are behaving
properly… but I don’t really know Verilog to do this easily So if
any of you know Verilog and want to collaborate on helping fix the
issue, it would be appreciated.

George

ketancmaheshwari · July 23, 2008, 8:40pm

execute? Perhaps this is happening unbeknownst to us. I guess the more

George

Just a quick follow-up to this. Last night I printed out diffs in the
std::clock() function along with the timestamp diffs right after the
packets are read. There are no jumps in the std::clock(), which tells
me the timestamp jump is caused by something strange like the clock
being overwritten and not an actual delay.

I’ll get back to you on the other requested tests.

Steve

ketancmaheshwari · July 24, 2008, 1:23am

Steve P. wrote:

Just a quick follow-up to this. Last night I printed out diffs in the
std::clock() function along with the timestamp diffs right after the
packets are read. There are no jumps in the std::clock(), which tells
me the timestamp jump is caused by something strange like the clock
being overwritten and not an actual delay.

I’ll get back to you on the other requested tests.

This is interesting… I decided to run some of my own tests. I did a
capture of ~10k packets with a single RX chain and I see the same issue.

I used a decimation rate of 64, and therefore the timestamps on the
recorded packets should be 64*126samples=8064 ticks apart.

I recorded 8 timestamp jumps of the following sizes:
10264763
1003
985
5569
1738245
1003
987
5557

Very odd, but there’s got to be something that explains it I’m
looking in to this now.

George

ketancmaheshwari · August 19, 2008, 12:56am

Hi Steve,

We certainly appreciate the work you’ve done (along with Thibauld et
al.) towards the inband functionality!

Thanks! It’s been a major group effort along with Eric, Thibaud, Leo,
and Brian.

Anyway, we’ve found a solution that seems to work, although I’m having
troubles testing it thoroughly. Basically, instead of having both the
interleaving and non-interleaving functionalities available in the
same rbf, we have made a new rbf that has all the USB packet/timestamp
functionality but assumes symbol interleaving. For our applications,
we don’t anticipate needing the flexibility of independent transmit
channels.

Okay, this sounds like a reasonable solution. I think that multiple
RBFs is not a bad thing. Especially since its a functionality that is
quite exclusive… i don’t think its something you’re likely to turn off
and on in a single application. You should be able to do the
interleaving at the host easily.

concern is exacerbated by a demo next month where we hope to use 4x4.
So we are taking the safer, yet messier, route for the moment.

hmmmm yeah, it was meant to be 1 usrp_server to 1 usrp. I’m not sure
you really need to do this. You just need to make sure that both USRP’s
clocks are synchronized. From that, the timestamp (which currently has
some issues) should allow you to transmit simultaneously on both.

Anyway, I wanted to give you (and the list in general) a status update
and let you know that we’ll be happy to collaborate on anything
regarding the inband code. Feel free to contact me and/or Ketan
outside of the list as well.

Thanks Steve! I will probably keep in contact with you so that you can
test some updates we have on the timestamp issue you found. I’m about
to send another mail to the list about this.

George

ketancmaheshwari · August 19, 2008, 1:54am

Okay, more on topic with the original problem: the timestamp issues.
The problem that Steve noticed was that the timestamps jump.

Brian and I wrote a testbench for this and found that the timestamp
actually does not really correspond to the first sample (per the design
doc), and not quite the last sample either… but closer to the last
sample. So… this is also an issue. But I think that fixing both will
be along the same lines.

Brian, can you verify all of this? I wanted to mail it to the list for
documentation, and so when Leo gets back he can read it. I’ve been
slammed with traveling in August and wont be back for another week.

So what we suspect is that the FX2 will keep up for a while, and then
get suspended/busy/something for a while which causes it to stop
building packets. The samples are still being buffered, but the actual
packets are not being built (causing a jump).

In this time, nothing becomes full and dropped, which is why an underrun
is not reported. All samples make it through to the host.

Brian’s solution was the following:

connect the input of packet_builder to be the output of the halfband
filter (for each channel)
output the previous to an asynchronous FIFO driven by rxclk
have a FIFO arbiter on the output running off usbclk
FIFO arbiter cycles through the FIFOs for a packet
if packet is available, connect FIFO to FX2 and wait for FX2 to
parse

Does that sound about right, Brian?

Two major issues we’re looking at solving: timestamping the first sample
(for real), and a non-jumpy timestamp.

Thanks!
George

ketancmaheshwari · August 19, 2008, 2:49am

On Mon, Aug 18, 2008 at 7:54 PM, George N. [email protected] wrote:

documentation, and so when Leo gets back he can read it. I’ve been slammed
with traveling in August and wont be back for another week.

Yes. There are issues, but enjoy traveling.

So what we suspect is that the FX2 will keep up for a while, and then get
suspended/busy/something for a while which causes it to stop building
packets. The samples are still being buffered, but the actual packets are
not being built (causing a jump).

In this time, nothing becomes full and dropped, which is why an underrun is
not reported. All samples make it through to the host.

Just to clarify the situation - the current packet_builder is hooked
up after the sample buffers so as it is building one packet, it cannot
build another. That is why when 2 channels were used, they would be
received 270+ samples from each other as that is around the time it
takes to generate the full packet. Moreover, the timestamp actually
had nothing to do with the actual sample time - just the time the
packet was built. Therefore, no 2 packets could have the same
timestamp (which should be possible).

Brian’s solution was the following:

connect the input of packet_builder to be the output of the halfband
filter (for each channel)

output the previous to an asynchronous FIFO driven by rxclk

have a FIFO arbiter on the output running off usbclk

FIFO arbiter cycles through the FIFOs for a packet

if packet is available, connect FIFO to FX2 and wait for FX2 to parse

Does that sound about right, Brian?

Correct. Multiple packet_builders which allow for proper timestamping
along with some artiber to get the packets out to the FX2.

Two major issues we’re looking at solving: timestamping the first sample
(for real), and a non-jumpy timestamp.

Yep!

Thanks!
George

No problem.

Brian

ketancmaheshwari · August 19, 2008, 4:30am

On Mon, Aug 18, 2008 at 04:54:20PM -0700, George N. wrote:

Okay, more on topic with the original problem: the timestamp issues. The
problem that Steve noticed was that the timestamps jump.

Brian and I wrote a testbench for this and found that the timestamp
actually does not really correspond to the first sample (per the design
doc), and not quite the last sample either… but closer to the last
sample. So… this is also an issue. But I think that fixing both will be
along the same lines.

George,

The timestamp semantics that we are using on the USRP2 is that for Rx,
the timestamp refers to the time that we clock the sample out of the
DSP Rx pipeline (n.b., not the Rx FIFO). For transmit, it’s the time
we clock the sample into the TX pipeline (not the Tx FIFO). In both
cases, the reference point is at the edge of the DSP pipeline. This
seems to be the only place that makes sense, since the group delay
through the pipeline varies depending on the configuration. In all
cases, the timestamp is in ticks of the master clock.

Does this make sense?

Eric

ketancmaheshwari · August 18, 2008, 1:21am

Hi again George,

I spend all of my time working with the single chain, and if I find bugs I
can fix them. But I think you’re the only ones using the dual chain… so
your feedback is useful and I’d appreciate any help solving the problems. I
don’t really know Verilog, and our Verilog coder is at an internship right
now

We certainly appreciate the work you’ve done (along with Thibauld et
al.) towards the inband functionality!

The main reason I bring this up is that Brian and I were talking on IRC
about the issue earlier, and agreed that a testbench for the FPGA timestamp
would be useful in ensuring the timestamps are behaving properly… but I
don’t really know Verilog to do this easily So if any of you know
Verilog and want to collaborate on helping fix the issue, it would be
appreciated.

Sorry I had missed this part before. We would be happy to help,
although our knowledge of Verilog is probably inferior. I’m not just
saying that to be modest.

We have been focusing on the TX chain at the moment. As has been
discussed in two or three places before on this list, the treatment of
individual channels as independent on the tx side puts severe
limitations on MIMO transmission. In particular, the USB buffer is
likely to fill up with packets for one channel, while the packets for
the other channel are queued and end up missing their scheduled time
of departure. This was first discussed between Thibauld and Brian
when designing the FPGA inband code. Brian said the solution would be
for the host to recognize when this would happen and split up large
packets… interleaving packets instead of symbols as before.

A guy who used to work on our project, Sanmi, had also contacted this
list about the problem, and you came up with a good solution of adding
an “interleaving” logical channel to the FPGA code, but you were
concerned about fitting this on the FPGA. Interleaving the packets on
the host end didn’t seem to be a good solution given how the host
ended up being designed.

Anyway, we’ve found a solution that seems to work, although I’m having
troubles testing it thoroughly. Basically, instead of having both the
interleaving and non-interleaving functionalities available in the
same rbf, we have made a new rbf that has all the USB packet/timestamp
functionality but assumes symbol interleaving. For our applications,
we don’t anticipate needing the flexibility of independent transmit
channels.

We’re currently in the stage of writing the host code for using
multiple USRPs. Previously, we had host code written to use the
inband_2rxhb_2tx.rbf with both transmit and receive on a single USRP,
and this was relatively successful. Our current strategy is a little
ugly; we’re modifying the inband code so that multiple USRPs are
controlled with a single usrp_server. I realize this is not how the
code was meant to be utilized, but we’ve got concerns about sending
data (which is supposed to be transmitted simultaneously) to two
independent usrp_servers and having them both arrive on time. This
concern is exacerbated by a demo next month where we hope to use 4x4.
So we are taking the safer, yet messier, route for the moment.

Anyway, I wanted to give you (and the list in general) a status update
and let you know that we’ll be happy to collaborate on anything
regarding the inband code. Feel free to contact me and/or Ketan
outside of the list as well.

George

Steve

ketancmaheshwari · August 19, 2008, 6:48pm

Thanks Brian!

Brian P. wrote:

Just to clarify the situation - the current packet_builder is hooked
up after the sample buffers so as it is building one packet, it cannot
build another. That is why when 2 channels were used, they would be
received 270+ samples from each other as that is around the time it
takes to generate the full packet. Moreover, the timestamp actually
had nothing to do with the actual sample time - just the time the
packet was built. Therefore, no 2 packets could have the same
timestamp (which should be possible).

I would think that if 2 channels are used, you would be getting a packet
from each channel with the same timestamp. I don’t think there should
be any offset between them. Right, so the current timestamp is more of
a “I built this packet at time X.”

As a non-hardware person, I was thinking that a simple solution is to
not really use the clock on the RX directly, but timestamp packets
indirectly. You have a signal when the sample buffer is empty, when
that signal is de-asserted for the first time, read the clock and take
this as your initial RX clock time.

Now, you go by the fact that you’re sampling at a constant rate and
timestamp packets based on that. The next packet gets timestamped at
last_timestamp + 126samples_per_packet * decimation_rate.

That way you’re not timestamp on packet building, but logically the time
the first sample of the packet arrived at the RX buffer.

George

ketancmaheshwari · August 19, 2008, 6:53pm

This sounds correct. I think the major issue is just that the current
implementation is taking the time the packet is built as the timestamp.
This is far from correct. Because of this, we don’t have a timestamp
that represents the first or last sample, and if the packet builder is
halted then we get these timestamp “jumps.”

George

ketancmaheshwari · August 21, 2008, 12:54am

That way you’re not timestamp on packet building, but logically the time the
first sample of the packet arrived at the RX buffer.

Correct me if I misunderstand, but you’re not actually clocking
anything this way. Theoretically it should work, but there’s nothing
telling me there weren’t actually a million clock cycles between two
packets. There should definitely be a timestamp clock tied to the
master clock, in my opinion.

Ketan and I discussed a couple possibilities earlier today. One might
be to have a timestamp buffer in parallel with the RX FIFO. Although
problems arise with making sure the timestamp changes at the border of
packets, unless there is one timestamp per sample, which is almost
certainly too excessive.

Another idea is to move the FIFO to the other side of the packet
builder, as long as the packet builder can keep up with the
ADC/decimator output. It then outputs to the buffer. This was
Ketan’s idea, so I apologize if I’m communicating it incorrectly.

George

Steve

ketancmaheshwari · August 21, 2008, 1:13am

On Wed, Aug 20, 2008 at 6:45 PM, Steve P. [email protected]
wrote:

That way you’re not timestamp on packet building, but logically the time the
problems arise with making sure the timestamp changes at the border of
packets, unless there is one timestamp per sample, which is almost
certainly too excessive.

Another idea is to move the FIFO to the other side of the packet
builder, as long as the packet builder can keep up with the
ADC/decimator output. It then outputs to the buffer. This was
Ketan’s idea, so I apologize if I’m communicating it incorrectly.

The current inband RX chain looks like:

N RX Sample Streams -> N RX Sample FIFOs -> 1 Packet Builder -> 1

USB FIFO → FX2 USB Interface

What it should look like (in my opinion):

N RX Sample Streams -> N Packet Builders -> N Packet FIFOs -> N:1

FIFO MUX → FX2 USB Interface

I believe the TX chain looks more like this, except with the arrows
going in the opposite direction.

On a side note, it might be interesting to have a command that can
turn on the receiver and receive a specific number of inband packets.
For example, if you know you may be receiving a transmission that is
only 2ms long in a specific slot, it might be beneficial to only
schedule 2ms (+/- a guard time) worth of samples to be delivered to
the host, freeing up more CPU cycles for signal processing and using
the USB bandwidth a little more efficiently.

Comments or suggestions?

Brian

ketancmaheshwari · August 21, 2008, 2:49pm

On Thu, Aug 21, 2008 at 3:11 AM, Eric S.
[email protected] wrote:

Hi all, I’d really like to help with this. I’ve been pouring over the
design (usrp_inband_usb), so my brain is a little mushy at the moment,
hopefully I’m not totally off track here.

Is it true that there are N RX Sample FIFOs? It seems that the channels
are already muxed by the time they get to the packet_builder, no?

The packet builder itself actually does the muxing:

http://www.gnuradio.org/trac/browser/gnuradio/trunk/usrp/fpga/inband_lib/packet_builder.v

The N RX sample FIFOs are built here:

http://www.gnuradio.org/trac/browser/gnuradio/trunk/usrp/fpga/inband_lib/rx_buffer_inband.v#L168

Can we just subtract the FIFO size (usedw?) from the timestamp in the
packet builder?

Sure, but what for? I suppose I am not sure what this is addressing?

When checking timestamps for expired command/data packets, is an
overflow/wraparound on the timestamp_clock handled? (ref cmd_reader.v:107,
chan_fifo_reader.v:145)

I don’t think it’s explicitly being taken care of, but I am also not
sure if the logic just ends up working out. This should probably be
looked at and a testbench written.

I wholeheartedly agree that it would be nice to have a “read N samples at
time T command”.

–ets

Brian