Benchmark_* not working correctly

Dev_R · September 20, 2007, 8:13pm

Hi all,

I have three computers running gnuradio, all updated to the latest
svn, all "sudo make uninstall"ed and "make distclean"ed first. I’m
running into a strange problem. When running benchmark_loopback.py in
the digital examples folder on my desktop (running Debian) and on my
laptop (running Fedora Core 6), I get the following results:

ok = False pktno = 8355 n_rcvd = 1 n_right = 0
ok = False pktno = 19 n_rcvd = 2 n_right = 0
ok = False pktno = 46 n_rcvd = 3 n_right = 0
ok = False pktno = 33 n_rcvd = 4 n_right = 0
ok = False pktno = 196 n_rcvd = 5 n_right = 0

I have another desktop set up with exactly the same hardware specs as
the one above. This one is running fedora core 6, and executing
benchmark_loopback.py gives me:

ok = True pktno = 0 n_rcvd = 1 n_right = 1
ok = True pktno = 1 n_rcvd = 2 n_right = 2
ok = True pktno = 2 n_rcvd = 3 n_right = 3
ok = True pktno = 3 n_rcvd = 4 n_right = 4
ok = True pktno = 4 n_rcvd = 5 n_right = 5

I’ve cleaned out and reinstalled gnuradio from all three computers
repeatedly, but I end up with the same result every time. Any
suggestions?

Thanks,
Dev

Dev_R · September 20, 2007, 8:40pm

Dev R. wrote:

I’ve cleaned out and reinstalled gnuradio from all three computers
repeatedly, but I end up with the same result every time. Any suggestions?

Can you check the version of Python and location of GNU Radio on each:

$ python
Python 2.5.1 (r251:54863, May 2 2007, 16:27:44)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gnuradio
>>> gnuradio.__file__
'/usr/local/lib/python2.5/site-packages/gnuradio/__init__.pyc'
>>> from gnuradio import blks2
>>> blks2.__file__
'/usr/local/lib/python2.5/site-packages/gnuradio/blks2/__init__.pyc'
>>> from gnuradio import blks2impl
>>> blks2impl.__file__
'/usr/local/lib/python2.5/site-packages/gnuradio/blks2impl/__init__.pyc'

Finally, the output of ./config.guess on each:

$ ./config.guess
x86_64-unknown-linux-gnu

Thanks.

--
Johnathan C.
Corgan Enterprises LLC
http://corganenterprises.com

Dev_R · September 20, 2007, 8:59pm

Johnathan C. wrote:

Type “help”, “copyright”, “credits” or “license” for more information.
Finally, the output of ./config.guess on each:

$ ./config.guess
x86_64-unknown-linux-gnu

Thanks.

For the Debian desktop:

Python 2.4.4 (#2, Jul 21 2007, 11:00:24)
[GCC 4.1.3 20070718 (prerelease) (Debian 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gnuradio
>>> gnuradio.__file__
'/usr/local/lib/python2.4/site-packages/gnuradio/__init__.pyc'
>>> from gnuradio import blks2
>>> blks2.__file__
'/usr/local/lib/python2.4/site-packages/gnuradio/blks2/__init__.pyc'
>>> from gnuradio import blks2impl
>>> blks2impl.__file__
'/usr/local/lib/python2.4/site-packages/gnuradio/blks2impl/__init__.pyc'

./config.guess :
i686-pc-linux-gnu

For the Fedora desktop:

Python 2.4.4 (#1, Oct 23 2006, 13:58:18)
[GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import gnuradio
>>> gnuradio.__file__
'/usr/local/lib64/python2.4/site-packages/gnuradio/__init__.pyc'
>>> from gnuradio import blks2
>>> blks2.__file__
'/usr/local/lib64/python2.4/site-packages/gnuradio/blks2/__init__.pyc'
>>> from gnuradio import blks2impl
>>> blks2impl.__file__
'/usr/local/lib64/python2.4/site-packages/gnuradio/blks2impl/__init__.pyc'

./config.guess :
x86_64-unknown-linux-gnu

I reverted the Debian laptop to a previous revision of the trunk (6400)
to see if that would help, but it didn’t seem to. It’s still rebuilding
to the current version right now, but its ./config.guess is:

i686-pc-linux-gnu

Hope that helps, thanks!
Dev

Dev_R · September 20, 2007, 9:14pm

Dev R. wrote:

Hope that helps, thanks!

It helps in the sense that it confirms a correct installation.

The failure symptoms are odd. The system shows receiving five packets,
which indicates that modulator/demodulator and framer/deframer are
probably working correctly.

The packet number is encoded at the beginning of the payload. The fact
that each of these is corrupted, and that the packets fail CRC check
("ok = False"), indicates the failure is likely happening in the
Python-based packet handling framework.
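
To make the two checks concrete, here is a minimal sketch, not the actual pkt.py code. It assumes a 2-byte big-endian packet number at the start of the payload and a CRC-32 appended to the frame; the helper names and the exact framing are hypothetical.

```python
import struct
import zlib

def make_payload(pktno, data):
    """Prefix data with a 16-bit big-endian packet number (assumed layout)."""
    return struct.pack("!H", pktno & 0xFFFF) + data

def add_crc32(payload):
    """Append a CRC-32 of the payload (hypothetical framing, for illustration)."""
    return payload + struct.pack("!I", zlib.crc32(payload) & 0xFFFFFFFF)

def check_crc32(frame):
    """Return (ok, payload); ok is True only if the trailing CRC matches."""
    payload = frame[:-4]
    rx_crc = struct.unpack("!I", frame[-4:])[0]
    return (zlib.crc32(payload) & 0xFFFFFFFF) == rx_crc, payload

def parse_pktno(payload):
    """Read the packet number back from the first two payload bytes."""
    return struct.unpack("!H", payload[:2])[0]

frame = add_crc32(make_payload(5, b"hello"))
ok, payload = check_crc32(frame)
good = (ok, parse_pktno(payload))

# Flip one bit in the packet-number field, as a corrupted receive path might.
corrupted = bytes([frame[0] ^ 0x80]) + frame[1:]
ok_bad, payload_bad = check_crc32(corrupted)
bad = (ok_bad, parse_pktno(payload_bad))
```

Under these assumptions a damaged frame reports ok = False together with an arbitrary packet number, which is the pattern in the failing output above.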

--
Johnathan C.
Corgan Enterprises LLC
http://corganenterprises.com

Dev_R · September 22, 2007, 4:22pm

Were either of you able to resolve this? Any hints on where I might try
to find the problem? I spent some time digging around pkt.py.

I get similar results running svn 6504 under Debian testing (Lenny).
This was a fresh install on a clean machine. I do not see the same
problems on a similar Debian machine running an older version of the
trunk.

ok = False pktno = 14 n_rcvd = 1 n_right = 0
ok = False pktno = 189 n_rcvd = 2 n_right = 0
ok = False pktno = 98 n_rcvd = 3 n_right = 0
ok = False pktno = 134 n_rcvd = 4 n_right = 0
ok = False pktno = 156 n_rcvd = 5 n_right = 0
ok = False pktno = 207 n_rcvd = 6 n_right = 0
ok = False pktno = 222 n_rcvd = 7 n_right = 0
ok = False pktno = 238 n_rcvd = 8 n_right = 0
ok = False pktno = 218 n_rcvd = 9 n_right = 0
ok = False pktno = 358 n_rcvd = 10 n_right = 0

Dev_R · September 20, 2007, 9:49pm

Johnathan C. wrote:

The packet number is encoded at the beginning of the payload. The fact
that each of these is corrupted, and that the packets fail CRC check
("ok = False"), indicates the failure is likely happening in the
Python-based packet handling framework.

Just to clarify, the Fedora desktop is receiving every packet in the
correct order with passed CRC, while the other computers seem to only
pick up maybe 1/20th of the sent packets and never pass the CRC check.
These computers all ran these tests correctly early last month iirc.
I’ll continue to look into it.

Thanks,
Dev

Dev_R · September 22, 2007, 6:29pm

Tim M. wrote:

ok = False pktno = 218 n_rcvd = 9 n_right = 0
ok = False pktno = 358 n_rcvd = 10 n_right = 0

Just to confirm: do the above packets have gaps in the receive times?
Are there periods where no packets are received?

--
Johnathan C.
Corgan Enterprises LLC
http://corganenterprises.com

Dev_R · September 22, 2007, 6:44pm

I suspect that the “pktno” from below is just garbage from false syncs.

I placed some prints in pkt.py:

send_pkt places a message on the queue 667 times.

In class queue_watcher_thread, under run,
msg = self.rcvd_pktq.delete_head() occurs 27 times.

Tim

Dev_R · September 25, 2007, 7:54pm

I’ve been looking into this problem further. I apologize in advance for
the long post. To make things easier, I’ll represent the three computers
I’m using as follows:

32deb - 32bit Debian desktop
32red - 32bit Fedora laptop
64red - 64bit Fedora desktop

benchmark_loopback.py:

Does not appear to work properly on either 32deb or 32red. When I set
the internal noise channel on in the script, I get random results as
follows. Judging from the correctly running script on the 64 bit system,
this detection happens every 15-20 “real” packets:

ok = False pktno = 8355 n_rcvd = 1 n_right = 0
ok = False pktno = 19 n_rcvd = 2 n_right = 0
ok = False pktno = 46 n_rcvd = 3 n_right = 0
ok = False pktno = 33 n_rcvd = 4 n_right = 0
ok = False pktno = 196 n_rcvd = 5 n_right = 0
… etc.

with different values on each run. When I take the noise channel off
(channelon = False), I get the following results on 32deb (the time
between every packet detection is much longer, about 10-20 times as long
with high variability):

ok = False pktno = 32 n_rcvd = 1 n_right = 0
ok = False pktno = 1 n_rcvd = 2 n_right = 0
ok = False pktno = 32997 n_rcvd = 3 n_right = 0
ok = False pktno = 142 n_rcvd = 4 n_right = 0
ok = False pktno = 109 n_rcvd = 5 n_right = 0
… etc.

Running the above repeatedly gives the same pktno and timing every run.
The same behavior happens on 32red, but the pktnos and timings are
different. I’ve tried running the CPU at close to 100% utilization on
both 32deb and 32red to see if there is some kind of over/underflow
occurring, but it didn’t change the above results.

64red runs correctly, giving the following output:

ok = True pktno = 0 n_rcvd = 1 n_right = 1
ok = True pktno = 1 n_rcvd = 2 n_right = 2
ok = True pktno = 2 n_rcvd = 3 n_right = 3
ok = True pktno = 3 n_rcvd = 4 n_right = 4
ok = True pktno = 4 n_rcvd = 5 n_right = 5
ok = True pktno = 5 n_rcvd = 6 n_right = 6
… etc.

benchmark_rx.py and benchmark_tx.py:

I’ve tried running benchmark_rx and tx with the following
configurations, in both directions:

32deb <-> 32red
32deb <-> 64red
32red <-> 64red
64red USRP 0 <-> 64red USRP 1

using:
./benchmark_tx.py -T A -f 2.4G
./benchmark_rx.py -R A -f 2.4G

In no case was the receiver able to demod any packets at all. In every
run I tested usrp_fft.py to verify that a signal was coming through on
the correct frequency. I exchanged the USRP from 32red to run on 64red,
but that didn’t help either.

I created a file sink after the amp block in benchmark_tx on each
computer and tested the beginning of each file against the others. It
seems like the benchmark_tx script is sending the same data to the USRP
in every case, which means the problem is in the RX side(?)
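
A minimal sketch of that kind of comparison, assuming each machine's file sink output has been copied to one place. The file names and contents are stand-ins, and only the first few kilobytes are compared, as described above.

```python
import os
import tempfile

def first_difference(path_a, path_b, nbytes=4096):
    """Return the index of the first differing byte within the first
    nbytes of two files, or None if those prefixes are identical."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a = fa.read(nbytes)
        b = fb.read(nbytes)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))  # one capture is shorter than the other
    return None

# Self-contained demo with stand-in files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    paths = {}
    for name, data in [("tx_32deb.dat", b"\x01\x02\x03\x04"),
                       ("tx_64red.dat", b"\x01\x02\x03\x04"),
                       ("tx_other.dat", b"\x01\x02\xff\x04")]:
        paths[name] = os.path.join(tmp, name)
        with open(paths[name], "wb") as f:
            f.write(data)
    same = first_difference(paths["tx_32deb.dat"], paths["tx_64red.dat"])
    differs_at = first_difference(paths["tx_32deb.dat"], paths["tx_other.dat"])
```

A result of None for every pair would support the conclusion that the transmit side produces identical samples on each machine.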

./benchmark_ofdm.py

Works correctly on every computer, showing:

ok: True pktno: 0 n_rcvd: 1 n_right: 1
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

ok: True pktno: 1 n_rcvd: 2 n_right: 2
0101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101
etc…

./benchmark_ofdm_rx.py and ./benchmark_ofdm_tx.py:

Same configurations as above were tried. In every case except “64red
USRP 0 <-> 64red USRP 1”, nothing was demodulated at all. With the 64red
transmitting between two connected USRPs with:

./benchmark_ofdm_rx.py -U 1 -R A -f 2.4G
./benchmark_ofdm_tx.py -U 0 -T A -f 2.4G

(added -U to the options to select USRP) I get:

ok: False pktno: 84 n_rcvd: 1 n_right: 0
ok: False pktno: 85 n_rcvd: 2 n_right: 0
ok: False pktno: 86 n_rcvd: 3 n_right: 0
ok: False pktno: 87 n_rcvd: 4 n_right: 0
ok: False pktno: 88 n_rcvd: 5 n_right: 0

The detection is strangely sporadic, with long series of packets not
being demodulated, followed by 3-4 second bursts of demodulated packets.

Lastly, running the Packet mod/demod example in GRC in the latest
version works correctly on every system. It displays the sine wave
correctly, and changes it appropriately with parameter modification.

Any suggestions on how to proceed?

Thanks,
Dev

Dev_R · September 26, 2007, 10:01pm

I have some additional hints.

When I run with 2 samples per symbol (the default)
./benchmark_loopback.py -s 20 -M 0.004 -S 2
ok = False pktno = 19 n_rcvd = 1 n_right = 0
ok = False pktno = 7 n_rcvd = 2 n_right = 0
ok = False pktno = 180 n_rcvd = 3 n_right = 0
…

This is consistent with Dev’s results.

When I run with 4 samples per symbol
./benchmark_loopback.py -s 20 -M 0.004 -S 4
ok = False pktno = 2 n_rcvd = 1 n_right = 0
ok = True pktno = 3 n_rcvd = 2 n_right = 1
ok = False pktno = 6 n_rcvd = 3 n_right = 1
…

So it is still broken but seems to perform “better” with 4 samples per
symbol. I have spent some time looking through all of the blocks and
looking through the .dat files with logging turned on. The data looks
OK up until gr_mpsk_receiver_cc. I spent a good amount of time looking
through this file, but I suspect the problem lies elsewhere because this
file does not seem to have changed, other than comments, since rev 5873.

I will keep looking.

Tim

Dev_R · September 27, 2007, 5:15pm

Running benchmark_loopback.py with gmsk (-m gmsk) seems to work
correctly for all my systems. Unfortunately, the same doesn’t seem to be
true of the benchmark_rx and benchmark_tx, which use gmsk by default.
Changing them to any of the other modulation options didn’t yield any
results either.

Good luck. I’ll continue to try to figure out what’s going wrong as
well.

Dev_R · October 1, 2007, 12:06am

Dev, Johnathan and All,

After re-reading Dev’s emails about 64 vs 32 bit and realizing that
gr_mpsk_receiver_cc.cc has not changed in some time, I took another look
at this.

On my build gr_fir_ccf->filter() was not functioning properly. This is
called by the gri_mmse_fir_interpolator from gr_mpsk_receiver_cc. Note
that my build was using the _simd code.

The symptom was that the interpolator was returning the real part of one
sample with the imaginary part of another sample. This explains why my
earlier “patch” appeared to “fix” the problem.
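
One way to detect this kind of mixing is sketched below. It relies on the fact that a filter with real taps processes the real and imaginary parts independently, so an input with imag == 2 * real must produce an output with the same ratio. The broken variant is a hypothetical stand-in for the faulty behaviour, not the actual _simd code path.

```python
def fir_ccf(taps, window):
    """Reference dot product: real taps applied to complex samples."""
    return sum(t * x for t, x in zip(taps, window))

def fir_ccf_mixed(taps, window):
    """Deliberately broken variant (hypothetical): the imaginary part is
    computed from a window shifted by one sample."""
    good = fir_ccf(taps, window[:-1])
    late = fir_ccf(taps, window[1:])
    return complex(good.real, late.imag)

def parts_consistent(z, tol=1e-9):
    """True if the output still satisfies imag == 2 * real."""
    return abs(z.imag - 2 * z.real) < tol

# Test signal with imag == 2 * real for every sample.
signal = [complex(n * n, 2 * n * n) for n in range(1, 10)]
taps = [0.25, 0.5, 0.25, 1.0]

ok_out = fir_ccf(taps, signal[:4])
bad_out = fir_ccf_mixed(taps, signal[:5])
```

Here parts_consistent(ok_out) holds while parts_consistent(bad_out) does not, so a test signal of this shape exposes a real/imaginary mismatch without needing to know the expected output values in advance.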

My final solution was to do a make clean and configure with
./configure --with-md-cpu=generic
This forced the use of the *_generic code instead of the *_simd code.

With this change ./benchmark_loopback.py seems to work with
-m dqpsk
and
-m dbpsk

I have not spent any time looking into why the *_simd code did not seem
to work.

Tim

Dev_R · October 1, 2007, 6:50pm

Tim,

I reconfigured both my 32 bit laptop and desktop to use the generic
code, and they both work now with benchmark_loopback. Unfortunately, I
am still unable to communicate between any USRPs with the benchmark_tx
and _rx scripts. Thanks for all your work on this. When I have some free
time I’ll try to figure out what’s wrong with the tx and rx scripts.

Dev

Dev_R · September 28, 2007, 9:47pm

Dev, Johnathan, and all

After spending some time looking through gr_mpsk_receiver_cc.cc I am not
sure how this ever worked. Below is a patch that works for me. There are
two minor changes:

1. I did not understand how the delay line for the interpolator could
work as coded so I changed it to what made sense to me.
2. I changed the sign of the return value on the phase_error_detector.

Let me know if this patch works for you.

Q: Is this code “gr_mpsk_receiver_cc.cc” going to remain in the
baseline? If so I will clean it up a bit, write some QA code, and submit
on the patch list.

Tim

Index: gr_mpsk_receiver_cc.cc

--- gr_mpsk_receiver_cc.cc (revision 6559)
+++ gr_mpsk_receiver_cc.cc (working copy)
@@ -142,7 +142,7 @@
 float gr_mpsk_receiver_cc::phase_error_detector_generic(gr_complex sample) const
 {
   //return gr_fast_atan2f(sample*conj(d_constellation[d_current_const_point]));
-  return -arg(sample*conj(d_constellation[d_current_const_point]));
+  return arg(sample*conj(d_constellation[d_current_const_point]));
 }
 
 // FIXME add these back in an test difference in performance
@@ -220,9 +220,10 @@
   sample = nco*symbol; // get the downconverted symbol
 
   // Fill up the delay line for the interpolator
-  d_dl[d_dl_idx] = sample;
-  d_dl[(d_dl_idx + DLLEN)] = sample; // put this in the second half of the buffer for overflows
-  d_dl_idx = (d_dl_idx+1) % DLLEN; // Keep the delay line index in bounds
+  for(int ii=0;ii<(DLLEN-1);ii++){
+    d_dl[ii] = d_dl[ii+1];
+  }
+  d_dl[DLLEN-1] = sample;
 }
 
 void
@@ -326,7 +327,7 @@
   }
 
   if(i < ninput_items[0]) {
-    gr_complex interp_sample = d_interp->interpolate(&d_dl[d_dl_idx], d_mu);
+    gr_complex interp_sample = d_interp->interpolate(&d_dl[0], d_mu);
 
     mm_error_tracking(interp_sample);     // corrects M&M sample time
     phase_error_tracking(interp_sample);  // corrects phase and frequency offsets
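
For comparison, the two delay-line schemes in the patch can be modelled as below. This is a sketch only: DLLEN = 8 is an assumed value and the class names are hypothetical. It suggests that both schemes hand the interpolator the same window of samples, and that what changes is where the window starts: at a moving index in the original code, always at index 0 in the patched code.

```python
DLLEN = 8  # delay-line length used for illustration (assumed value)

class CircularDelayLine:
    """Original scheme: write each sample twice so a contiguous window of
    DLLEN samples always starts at the current index."""
    def __init__(self):
        self.buf = [0] * (2 * DLLEN)
        self.idx = 0

    def push(self, sample):
        self.buf[self.idx] = sample
        self.buf[self.idx + DLLEN] = sample  # mirror into the second half
        self.idx = (self.idx + 1) % DLLEN

    def window(self):
        return self.buf[self.idx:self.idx + DLLEN]

class ShiftingDelayLine:
    """Patched scheme: shift everything down by one and append at the end,
    so the window always starts at index 0."""
    def __init__(self):
        self.buf = [0] * DLLEN

    def push(self, sample):
        for ii in range(DLLEN - 1):
            self.buf[ii] = self.buf[ii + 1]
        self.buf[DLLEN - 1] = sample

    def window(self):
        return list(self.buf)

circ, shift = CircularDelayLine(), ShiftingDelayLine()
start_indices = set()
windows_match = True
for n in range(1, 30):
    circ.push(n)
    shift.push(n)
    start_indices.add(circ.idx)
    windows_match = windows_match and circ.window() == shift.window()
```

In this model windows_match stays True while the circular scheme's start index visits every position from 0 to DLLEN-1, so the start address is the one thing that differs between the two versions.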

Dev_R · October 1, 2007, 8:44pm

Just tried this on my laptop which seems to be using whatever
non-functioning code Tim mentioned. Make check passed successfully.

Dev

Dev_R · October 1, 2007, 9:42pm

Make check does pass, and there appears to be QA code
(qa_gr_fir_ccf.cc). I am not sure whether the QA code actually gets
called by a top level “make check”. If you would like me to look into it
I can, but I suspect whoever wrote the QA code originally could do it a
lot faster than me :-)

Dev_R · October 1, 2007, 7:11pm

Hi!

Does “make check” pass on your system when you set it to use SIMD? It
would be interesting to know if this error is not found with the
standard tests.

Dominik

Dev_R · October 2, 2007, 1:42am

On Mon, Oct 01, 2007 at 03:40:40PM -0400, Tim M. wrote:

Make check does pass, and there appears to be QA code
(qa_gr_fir_ccf.cc). I am not sure whether the QA code actually gets
called by a top level “make check”. If you would like me to look into it
I can, but I suspect whoever wrote the QA code originally could do it a
lot faster than me :-)

Yes, it does get called at “make check” time.

FWIW, it’s run by way of gnuradio-core/src/tests/test_all

It’s possible that there’s an alignment requirement that’s not being
honored at runtime. The low-level SSE code (fcomplex_dotprod_sse64.S)
requires that its input and taps be 16-byte aligned. gr_fir_ccf_simd
allocates 16-byte aligned buffers for the relevant buffers, so it
should be working OK. Perhaps one of you seeing the problem could
add an assert or two to confirm that the alignment is correct.
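
The check itself is just a mask on the address; in C++ the analogous assert would test that the pointer value has its low four bits clear. Below is a small Python model of the same arithmetic, using ctypes only to obtain a real address. The helper names are hypothetical, and the over-allocate-and-skip approach is a common technique shown here as an assumption about how aligned buffers are typically obtained, not a description of gr_fir_ccf_simd.

```python
import ctypes

def is_aligned(address, boundary=16):
    """True if address is a multiple of boundary (a power of two)."""
    return address & (boundary - 1) == 0

def aligned_view(nbytes, boundary=16):
    """Over-allocate, then skip forward to the next aligned address.
    Returns (raw_buffer, offset); keep raw_buffer alive while using it."""
    raw = ctypes.create_string_buffer(nbytes + boundary - 1)
    offset = (-ctypes.addressof(raw)) % boundary
    return raw, offset

raw, offset = aligned_view(64)
base = ctypes.addressof(raw) + offset
```

With this, is_aligned(base) is True, while is_aligned(base + 8) is False even though base + 8 is still 8-byte aligned.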

Eric

Dev_R · October 2, 2007, 6:46am

On Mon, Oct 01, 2007 at 06:07:51PM -0700, Tim M. wrote:

represents “forced alignment” from gr_fir_ccf_simd.cc
RCRC… OK
00RC… OK
0RCR… Not OK

Hmmm. Does it ever use the 0RCR case? I would expect only the first
two. It may be reusing the fff simd code which generates all 4
alignments for the taps, but I wouldn’t expect to see the 0RCR or 000R
input cases.

Q: Is my assumption of the additional requirement correct?

Q: I don’t think it will be easy to force the additional requirement with
the same trick used in gr_fir_ccf_simd.cc; do you agree?

I don’t see this as an additional constraint.
gr_complex == std::complex<float> is always laid out (real, imag).
sizeof(gr_complex) == 8, so with 16-byte alignment, we still always
have good alignment. Are you seeing a case where the input has the
real on a mod 8 == 4 boundary instead of a mod 8 == 0 boundary?
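
The arithmetic behind that argument can be worked through as follows. This is a sketch using the thread's own notation (0 = padding, R = real, C = imaginary) and assuming an array of gr_complex whose base address is 16-byte aligned; the function name is hypothetical.

```python
FLOAT_SIZE = 4
COMPLEX_SIZE = 2 * FLOAT_SIZE   # sizeof(gr_complex) == 8, as stated above

def pattern(byte_offset):
    """Describe where the first real float lands inside a 16-byte block.
    byte_offset is assumed to be a multiple of 4."""
    slot = (byte_offset % 16) // FLOAT_SIZE
    return ("0" * slot + "RCRC")[:4]

# Elements of a gr_complex array whose base is 16-byte aligned.
element_offsets = [k * COMPLEX_SIZE for k in range(6)]
mod8 = [off % 8 for off in element_offsets]
patterns = [pattern(off) for off in element_offsets]
```

Every element offset is 0 mod 8, and the patterns alternate between RCRC and 00RC. The 0RCR and 000R cases correspond to byte offsets 4 and 12, which cannot occur for whole elements of such an array.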

If so, (1) where’s the input data coming from, (2) what version of the
compiler are you using?

However, back to your first point, if we are using the 0RCR case, then
the code is completely wrong, and I don’t see how it could ever pass
the QA tests (which it seems to). On the other hand, there could be
some problem with how the float taps are mapped across the complex
input (It’s been a long time since I looked at the code…)

Thanks for looking at this!

Eric

Dev_R · October 2, 2007, 3:08am

Eric,

The QA code (qa_gr_fir_ccf.cc) forces a 16-byte alignment. When the
malloc16Align is replaced with a regular malloc in the QA code, make
check fails.

I believe that there is an additional requirement that the data passed
to the low-level SSE code have the real sample start on the 0th or 2nd
4-byte float. For example, where R / C represent 4-byte floats (Real,
Complex) and 0 represents “forced alignment” from gr_fir_ccf_simd.cc:
RCRC… OK
00RC… OK
0RCR… Not OK

Q: Is my assumption of the additional requirement correct?

Q: I don’t think it will be easy to force the additional requirement
with the same trick used in gr_fir_ccf_simd.cc; do you agree?

Tim