Debugging overruns

Hi,

I’m trying to figure out what the right tools are for debugging problems
like overruns in the USRP. I have a modified version of tunnel.py running
(I basically took out the tun/tap interfaces and added my own on top), and
I’m finding that after exactly 700 128-byte packets I see an overrun -
every time I run the program. I’ve cut out as much as I can from my code -
removing debug prints, etc. - but the problem persists. Do you have any
suggestions as to where to start looking?

Thanks,

Dan

On Sat, Jan 27, 2007 at 01:47:09PM -0800, Dan H. wrote:

overruns or underruns?

Underruns are to be expected with tunnel.py, assuming that you’re not
feeding it data constantly.

(When the in-band signaling stuff is complete, we’ll have a more
sensible interpretation for the underrun case. It’ll only
report a problem if it occurs within a packet, not between packets.)

Are you running with real-time scheduling enabled? If you run
tunnel.py as root (or with CAP_SYS_NICE) it’ll be enabled. (This is
currently only implemented on systems that provide sched_setscheduler.)
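
If you want to double-check that the scheduler change actually took
effect, here is a hedged sketch (it uses the os scheduling calls from
newer Pythons; tunnel.py arranges the equivalent internally through
sched_setscheduler):

import os

# Try to claim SCHED_FIFO; this needs root or CAP_SYS_NICE.
try:
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(1))
    print("real-time scheduling enabled (SCHED_FIFO)")
except PermissionError:
    print("no permission: run as root or grant CAP_SYS_NICE")

# Verify which policy the process actually ended up with.
policy = os.sched_getscheduler(0)
print("real-time" if policy in (os.SCHED_FIFO, os.SCHED_RR) else "normal")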

Have you enabled logging? Turn it off.

Linux or some other OS?

Does the unmodified tunnel.py exhibit the same behavior?

Does benchmark_tx.py / benchmark_rx.py work without over/underruns?

Eric

Eric B. wrote:

overruns or underruns?

uO means a USRP overrun, right?

Real-time scheduling is enabled; the process gets priority -50. Also,
an incidental question: I get really bad performance when the
fusb_options that are set when realtime is true are used…

Have you enabled logging? Turn it off.

Logging is off.

Linux or some other OS?

Ubuntu 6.10 (but installed before all the recent libtool fun).

Does the unmodified tunnel.py exhibit the same behavior?

Does benchmark_tx.py / benchmark_rx.py work without over/underruns?

No, yes.

I suspect the problem has something to do with randomization; I’m trying
to write a more comprehensive benchmark where I send random payloads.
Using 1024-byte packets (i.e. 1024 random bytes generated per packet), I
can get 524 +/- 1 sent. With 768-byte packets, I can send 700 +/- 1. With
1200-byte packets, I can send 445 packets, more or less. The product of
all of these numbers is close:

1200 * 445 = 534000
1024 * 524 = 536576
768 * 700 = 537600

Perhaps Python does something funky after a certain number of bytes? I’m
using these functions:

from random import seed, randint

def rand_init(s=0):
    seed(s)

def random_bytes(number):
    ret = ""
    for i in range(number):
        ret += chr(randint(0, 255))
    return ret

Or could it be some garbage collection kicking in? I know that this
function is extraordinarily wasteful of memory… Except then it doesn’t
make sense why that product would be constant…

-Dan

An unanswered question from before:

Also, an incidental question: I get really bad performance when the fusb_options that are set when realtime is true are used…

What are the fusb_options all about, and how can I get intuition on the
right settings for them?

Also, are you sure you’re not holding onto references to old payloads
somewhere? If you are, no amount of garbage collection or reference
counting will save you ;-)

I implemented the above change and a few other optimizations (googling
for “python optimization” is an effective tactic), but the problem
persists. It does seem to be tied to Python’s randomization choking
after randint has been called ~512k times; in particular, 512k = 524288
bytes, and I was running into massive overruns after generating:

~445 1200-byte packets (1200 * 445 = 534000)
~524 1024-byte packets (1024 * 524 = 536576)
~700 768-byte packets (768 * 700 = 537600)

Old payloads are definitely not being kept around; they are processed
and the only output is the number of bit errors in each packet.

I’ll figure out another way around the randomness problem, maybe using
an LRS or something.

-Dan

On Sat, Jan 27, 2007 at 04:25:19PM -0800, Dan H. wrote:

Eric B. wrote:

overruns or underruns?

uO means a USRP overrun, right?

Yes.

def random_bytes(number):
    ret = ""
    for i in range(number):
        ret += chr(randint(0, 255))
    return ret

Or could it be some garbage collection kicking in? I know that this
function is extraordinarily wasteful of memory… Except then it doesn’t
make sense why that product would be constant…

I think you’re burning up all the cycles constructing the string
of random bytes. Building the string byte by byte is very expensive:
basically O(N^2), since each += copies the entire string built so far.

Try this:

def random_bytes(number):
    return ''.join([chr(randint(0, 255)) for x in range(number)])
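
If you want to see the difference for yourself, here is a quick timing
sketch (the numbers vary by machine, and very recent CPythons sometimes
optimize the += case in place, but on a 2007-era Python the gap is
dramatic):

import timeit

setup = "from random import randint"

byte_by_byte = """
ret = ''
for i in range(1024):
    ret += chr(randint(0, 255))
"""

join_version = "''.join([chr(randint(0, 255)) for x in range(1024)])"

# The += version copies the partially built string on every append,
# so its cost grows roughly quadratically with the payload size.
print(timeit.Timer(byte_by_byte, setup).timeit(number=100))
print(timeit.Timer(join_version, setup).timeit(number=100))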

Also, are you sure you’re not holding onto references to old payloads
somewhere? If you are, no amount of garbage collection or reference
counting will save you ;-)

Eric

On Mon, Jan 29, 2007 at 10:23:15AM -0800, Dan H. wrote:

An unanswered question from before:

Also, an incidental question: I get really bad performance when the fusb_options that are set when realtime is true are used…

What are the fusb_options all about, and how can I get intuition on the
right settings for them?

They set the amount of buffering being done in the Fast USB interface.
Under Linux it goes like this (NetBSD is similar):

block_size is the size of the transfer made to/from the kernel. It
must be a multiple of 512 bytes. Bigger block_sizes give lower
overhead (fewer kernel calls to move a given number of bytes);
however, if you’re trying to reduce worst-case latency (particularly
important in transceiver apps that do carrier sense), smaller values
are better.

nblocks is the maximum number of blocks that are scheduled for I/O
at any one time. Under Linux we use a usbfs ioctl to asynchronously
submit multiple requests. If you don’t care about latency, the
default value (not specifying the fusb_nblocks ctor arg) is fine.
If you’re trying to reduce your worst-case latency, then smaller
values of nblocks are better. The minimum that I’ve ever seen work
is 4.

When running as realtime, it’s possible to run with less buffering
since the USRP library code doesn’t get preempted by the X-server, etc.

The values specified in tunnel.py were found by experimentation with my
X30 and X61 laptops. (1.4 GHz Pentium M and 1.8 GHz Core Duo
respectively).
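
One way to build intuition is to work out how much data a given setting
keeps in flight and what that costs in latency. A rough sketch (the
helper name and the 32 MB/s figure are mine; 32 MB/s is the USRP’s
full USB rate, so scale for your decimation):

def fusb_latency(block_size, nblocks, byte_rate=32e6):
    # block_size: kernel transfer size in bytes (multiple of 512)
    # nblocks:    max transfers scheduled for I/O at once
    # byte_rate:  USB throughput in bytes per second
    assert block_size % 512 == 0, "block_size must be a multiple of 512"
    return block_size * nblocks / byte_rate

# A small low-latency profile vs. a throughput-oriented one:
print(fusb_latency(2048, 4))     # ~0.00026 s, about a quarter millisecond
print(fusb_latency(16384, 16))   # ~0.0082 s, roughly 8 ms of buffering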

Also, are you sure you’re not holding onto references to old payloads
somewhere? If you are, no amount of garbage collection or reference
counting will save you ;-)

Old payloads are definitely not being kept around; they are processed
and the only output is the number of bit errors in each packet.

Good.

I’ll figure out another way around the randomness problem, maybe using
an LRS or something.

If you don’t need high-quality randomness, a linear congruential
pseudo-random generator will do the trick in a few operations. See Knuth.
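
For instance, a minimal sketch of that idea (the constants are the
common Numerical Recipes choice; any full-period LCG from Knuth’s
volume 2 works):

class LCG:
    # x_{n+1} = (a * x_n + c) mod 2^32; return the high byte of each
    # state, since the low-order bits of an LCG are the least random.
    def __init__(self, s=0):
        self.state = s & 0xffffffff

    def next_byte(self):
        self.state = (1664525 * self.state + 1013904223) & 0xffffffff
        return (self.state >> 24) & 0xff

def random_bytes(gen, number):
    return ''.join([chr(gen.next_byte()) for i in range(number)])

gen = LCG(42)   # seeded, so runs stay reproducible
payload = random_bytes(gen, 1024)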

You might try reading /dev/urandom; however, your tests won’t be
reproducible.
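
If reproducibility doesn’t matter, that’s only a few lines (a sketch;
os.urandom(number) does the same thing and has been available since
Python 2.4):

def urandom_bytes(number):
    # Kernel-supplied randomness: cheap and random enough for payloads,
    # but different on every run.
    f = open('/dev/urandom', 'rb')
    data = f.read(number)
    f.close()
    return data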

Eric