GNU Radio apps freeze (lock up)

Hi list,

After I upgraded to the latest GNU Radio 3.5.2 and the latest UHD (and
images), GR applications just freeze when running. There are no
warnings, error messages, or overflows; they just freeze.
A simple FFT plot directly on received samples from the USRP N210
freezes after some seconds or minutes (depending on the sample rate),
although the load on the machine is not high. I need to kill the GR app
and restart it, and then the same problem occurs again.

This was never a problem with the earlier versions of GR/UHD I used,
from about 6 months ago.
The freezing happens sooner with a high sample rate setting, but
eventually also with lower ones. No overflows happen (which I could get
before with too high a sample rate or load, etc.).

The USRP N210 stops sending samples to the computer at the same moment
as GNU Radio freezes (as observed on the system monitor).

The same thing happens on two identical laptops running Ubuntu 10.10
(which I also upgraded from 10.04). I am not sure if it's strictly a GNU
Radio problem (since it worked before), a UHD problem, or some problem
with Ubuntu Linux 10.10. It work(ed) flawlessly on another machine with
OS X (before I tried to upgrade GR on it but then got stuck…) with an
identical UHD version and images.

Installation of UHD+GNU Radio with the automatic Linux script runs
without any problems, as before, with no errors or warnings.

Any de-freezing help or clues appreciated!

Rickard

On Wed, Mar 28, 2012 at 10:02 AM, Rickard R. [email protected]
wrote:

Rickard,

Just to be clear: when you install 3.5.2 from the tarball, it freezes,
but when you use the build-gnuradio script, everything works fine?

What’s your machine?

Tom

On Mar 29, 2012, at 3:26 AM, Tom R. wrote:

Tom,

The installation with the build-gnuradio script works just fine, as
before (no tarballs).
I get the same result on both laptops, Acer Aspire TimelineX with i3
processors (2.26 GHz), running Ubuntu 10.10.
I did not have this problem earlier with Ubuntu 10.04, or on a Mac with
OS X (with the source from git). Could Ubuntu 10.10 cause the problem
somehow?

Note: The halting/freezing only happens when running an application
(with the N210) as a receiver (not as a transmitter, see below).
The flow of received samples just halts after a while and the
application freezes/halts (as a consequence).
This happens sooner with a high sampling rate (after a few seconds at
25 MSPS), but eventually also with somewhat lower sample rates. The CPUs
are not overloaded (< 50%).
It happens even if the UHD USRP source is connected directly to a null
sink only. I do not get any overflows before the halt.
In fact, I cannot even provoke a continuous stream of overflows, since
the reception just halts instead of producing overflows, which was the
result earlier.
GNU Radio itself does not freeze as a whole (e.g. GRC in the
background), just the running application, which I then need to abort.

Strangely enough, this “freezing/halting” does NOT happen when
transmitting, even with high transmit sample rates such as 25 MSPS (or,
as is now possible, even 50 MSPS with 8-bit samples!). Then it works
just fine, even without underruns (when using just a file source).

/ Rickard

On Thu, Mar 29, 2012 at 6:04 AM, Rickard R. [email protected]
wrote:

Rickard,

Thanks for the data. Unfortunately, I have no idea what to make of
this. There isn’t that much difference between the last release
(3.5.2.1) and what you get using the build-gnuradio script. That just
grabs the latest master version from our git repo, and we haven’t done
all that much since the release. That really doesn’t make a lot of
sense.

How’s your git? If you’re comfortable doing so, can you check out the
v3.5.2.1 tag on git and try that one instead of the tarball release?

Thanks,
Tom

On 31 mar 2012, at 15.43, Tom R. wrote:

Thanks for your info. My current take on this interrupt issue:

I now suspect the problem may not be GNU Radio’s or UHD’s “fault”.
Instead, the network connection and its settings might somehow cause the
interruptions. However, I am no Linux guru, so I am learning as I go.

First, I’ve updated Ubuntu, from release 10.10 to 11.04, but then
nothing worked (just got a completely blank screen without any gui), so
fast-forward upgraded to 11.10 and then both laptops came right back on
track. I then also updated the gnuradio+uhd to latest (3.6.0) version
using the excellent build-gnuradio script (as before), which itself went
just fine (also as before).

Unfortunately, the exact same problem with interruptions (a total halt)
in the receiver at high sample rates persists. Note, as before, this
happens (only?) at very high rates over the Ethernet (about 400 Mbit/s
or more; the higher the rate, the sooner it halts, typically after just
a few seconds), although the computer displays no overruns or other
errors.

Then, by experimenting with the MTU setting (100, 500, 1500, 5000,
etc.), increasing the buffer settings (net.core.rmem_max,
net.core.wmem_max) whose defaults are too low (gnuradio initially
complained about them), disabling the Ubuntu network manager (via the
menu), and replacing the automatic network configuration when the USRP
N210 connects with a manual setup using
“sudo ifconfig eth1 192.168.10.1”, I can sometimes get the Ethernet
connection into a state that works with the N210 at high sampling rates
without any interruptions at all!
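
The buffer limits mentioned above can be checked before a run. What
follows is only a minimal Python sketch, not part of UHD or GNU Radio:
the helper names (read_core_limits, undersized_buffers) are made up for
illustration, and the target values are assumptions that should be
replaced with whatever UHD prints in its own warning on your system. It
assumes a Linux host where the limits are exposed under
/proc/sys/net/core.

```python
from pathlib import Path

# Assumed targets in bytes; replace with the values UHD suggests on your system.
ASSUMED_TARGETS = {"rmem_max": 50_000_000, "wmem_max": 1_048_576}


def read_core_limits(names, base="/proc/sys/net/core"):
    """Read integer limits such as rmem_max from the Linux proc filesystem."""
    return {name: int((Path(base) / name).read_text()) for name in names}


def undersized_buffers(current, targets):
    """Return {name: (current, target)} for each limit below its target."""
    result = {}
    for name, target in targets.items():
        value = current.get(name, 0)  # a missing entry counts as too small
        if value < target:
            result[name] = (value, target)
    return result


# Typical use on the receiving host (Linux only):
#   print(undersized_buffers(read_core_limits(ASSUMED_TARGETS), ASSUMED_TARGETS))
```

An empty result means every limit already meets its assumed target; it
says nothing about the MTU or about what network-manager is doing.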

In that case, I get a beautiful continuous flow of samples to the
crunching computer (e.g. an FFT plot) at the highest possible rate,
50 MSPS with 8-bit samples over the wire. This can happen with an MTU
above 1500 and the buffers increased to the settings recommended by
UHD. When this works, the Ubuntu system monitor tells me that some
834 Mbit/s flows through the Ethernet cable, with about 50% load on the
CPU cores. Very nice. It then also works for repeated runs, not just one
“lucky” one, and for a prolonged period of time (more than an hour). In
the working network state I can also easily provoke the expected
overruns, strings of ‘ooooooooooo’, which isn’t possible when the
Ethernet connection is in the “wrong”, “interrupted” state, since then
it just halts/stops without further info.

However, finding this working network state is not just a matter of
setting the particular network parameters, as I had hoped… I suspect
other things are happening behind the scenes, maybe some other settings
(I do not yet have full knowledge of Ethernet’s functionality and
behavior; there may be more influencing parameters?), which prevent me
from finding the working network state in a consistent manner. Quite
weird.

I have a USRP N210 rev2 (says the sticker on the back), but I now
noticed that when I burned the latest firmware and FPGA images with the
net burner tool, it prints “Hardware type: n210_r3” although I selected
the FPGA image for rev2, corresponding to my version. Could this be
related to my issue? I haven’t noticed that inconsistent message
earlier, though.

If some of this rings any bells, please give me some further advice.
Sorry for the long post.

Thanks,
Rickard

On Tue, Apr 10, 2012 at 1:16 PM, Rickard [email protected] wrote:

My first guess is that network-manager sucks, and it’s cutting your
connection. I’ve seen this on Ubuntu on a semi-regular basis even on my
own laptop, when using ifconfig to set the adapter address manually,
sometimes network-manager decides to bring the interface down just
because. To diagnose this, look to see the network light on the N210 go
out after the samples stop coming. The solution is to go into the
network configuration and set eth0 (or whatever network adapter it is)
to use a static IP address instead of “auto”.

–n

On Apr 10, 2012, at 11:55 PM, Nick F. wrote:

Thanks for your suggestions.

Yes, this sounds quite related to what I experience. But in this case,
when the samples stop coming, the orange LED on the N210 (upper right
side) goes from a constant light to high-frequency blinking, until I
abort the gr-script and the light goes out completely. The IP address
typically stays put, so I don’t have to re-enter it with ifconfig after
the samples stop coming. Sometimes, however, what you describe does
happen (it loses its IP address), but that seems to be uncorrelated with
the sudden interruptions I have…

I have tried what you suggest and disabled the automatic connection in
the network manager. It seems to get a bit better: the stop usually
comes later (it’s a random phenomenon), but eventually it stops
nevertheless.

It might be the actual hardware driver for the Ethernet card which is
causing the trouble, but further diagnosis is necessary for my patient.
I found a thread on “the internet” describing symptoms quite similar to
what my patient suffers from:

The Ethernet interface is labeled “Atheros Communications AR8151 v1.0
Gigabit Ethernet”, but I haven’t yet found the appropriate Linux drivers
or how to install them.

/ Rickard

Hello Rickard,

Did you find a permanent or working solution to your problem? I think
that I am facing a similar issue.

In my application, the receiver USRP senses the channel in a repeated
on-off manner. The receiver USRP runs for 200 msec, gathers the complex
data in some files, does some processing, plots the data, and waits for
my input. When I press a button, the USRP starts again and repeats the
whole process. The problem is: if I run the code at a very high rate
(20 MS/s with 32 bits per complex sample), the application sometimes
freezes or hangs without giving any error. Currently, this happens
within 15-20 starts. The problem occurs less frequently if I run it at
lower sampling rates (e.g. 2 MS/s).

I am using a USRP N210, the latest GNU Radio and UHD drivers, and Ubuntu
12.04. I believe that my issues are quite similar to the ones that you
faced. It would be great to know if you found a (somewhat) working
solution.

Thanks,

Nazmul

On Fri, Apr 13, 2012 at 10:59 AM, Rickard R. [email protected] wrote:

Muhammad Nazmul I.

Graduate Student
Electrical & Computer Engineering
Wireless Information & Networking Laboratory
Rutgers, USA.

Hi list,

This is a very inconsistent and stubborn problem!

I’ve tested the Ethernet connection with the “iperf” testing tool. I
used the same Ethernet cable but connected directly between the two
laptops (see below) which have the UHD Ethernet issues. I have also
reinstalled Ubuntu to a fresh 12.04 and used latest UHD.

With a careful tuning of both the UHD buffer size and packet size, UDP
packet streaming works according to iperf up to about 920 Mbit/s without
any packet losses in either direction. Please see transcripts in
attached file. Running the iperf test with TCP instead of UDP gives
similar results.

This should verify that the Ethernet cards in the computers are sound
and also work at very high rates close to the gigabit limit.
Note that the Ethernet in the computers thus works at higher rates than
what UHD requires at maximum speed (which is about 850 Mbit/s incl.
overhead at 25 MSPS).
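
As a rough cross-check of these numbers, the raw payload rate follows
from the sample rate times the bits per complex sample. This is a
back-of-the-envelope sketch only: the function name is hypothetical and
packet header overhead is not modeled, so the result is a lower bound on
the actual wire rate.

```python
def payload_rate_mbit(sample_rate_hz, bits_per_component):
    """Payload rate in Mbit/s for complex samples (one I and one Q component)."""
    return sample_rate_hz * 2 * bits_per_component / 1e6


# 16-bit I/Q at 25 MSPS and 8-bit I/Q at 50 MSPS carry the same payload:
#   payload_rate_mbit(25e6, 16)
#   payload_rate_mbit(50e6, 8)
```

Both cases come out at 800 Mbit/s of payload, which is consistent with
the roughly 850 Mbit/s including overhead quoted above and is below the
920 Mbit/s that iperf measured on the same link.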

However, despite this fact I cannot get rid of the “halting problem”
with the UHD and the transmission freezes although I try to use the same
buffer and frame length parameters as in the iperf test. See attached
file. I have also tried many other combinations of the frame length and
buffer size but without any luck.

It seems like something weird is going on with UHD at high sample rates
which some Ethernet cards, like in these computers, cannot handle.

Can anyone explain this inconsistent behavior or even better offer a
working solution?
Please see the transcripts in the attached text file.
Also, what does “Unexpected error on recv, continuing… Error code: 1”
mean or indicate?

/ Rickard

Rickard:

I have run 50Msps/8-bit for hours on end with my system here, which is
Fedora 14, and using cheap RTL-based network cards.

There is a known hardware problem with certain Intel 1 GigE chipsets:
the 82577LM and 82579LM have problems that sound similar to what you’re
reporting at high sample rates.


Marcus L.
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

However, despite this fact I cannot get rid of the “halting problem” with the
UHD and the transmission freezes although I try to use the same buffer and frame
length parameters as in the iperf test. See attached file. I have also tried
many other combinations of the frame length and buffer size but without any
luck.

It seems like something weird is going on with UHD at high sample rates which
some Ethernet cards, like in these computers, cannot handle.

Can you confirm that it’s not just the GUI locking up?

Sometimes compositing or just funny GL setups can cause the FFT
plotter to have horrible rendering. Or just lower the frame rate.

Can anyone explain this inconsistent behavior or even better offer a working
solution?
Please see the transcripts in the attached text file.
Also, what does “Unexpected error on recv, continuing… Error code: 1” mean or
indicate?
../…/qr2/host/grc_examples/qr2_1ch_compass.grc

It’s a timeout. If something was lost due to overflow, the recv call
will time out trying to obtain the lost samples. It’s just the way the
example was written.

-josh

Can you confirm that it’s not just the GUI locking up?
Sometimes compositing or just funny GL setups can cause the FFT
plotter to have horrible rendering. Or just lower the frame rate.

Both (GUI and rendering) lock up. I suspect the same as you: “funny GL
setups can cause the FFT plotter to have horrible rendering”.
Sometimes the uhd_fft window (on Ubuntu) changes from gray to white
(disabled -> enabled) every 5 sec.

Patrik

As a reference for this issue: using F12 and a slow dual core, we never
see a freeze, even at the USRP1 max sample rate. We’re using the old
traditional GR (usrp_fft.py). The last computer + USRP1 boot was in
September (we boot every 2-3 months). We receive 24/7 at 2/4/8 MHz
bandwidth.

Using Ubuntu and UHD we do see problems.
Something funny has happened in the graphics drivers, we suspect.

Patrik

On 11/25/2012 11:36 PM, Patrik T. wrote:

The newer ubuntus enable all kinds of effects that use compositing.
After disabling every graphical “effect” on my kubuntu installation,
both unreal tournament and fft sink have much better frame rates.

-josh

10-4, we’ll try.

Thanks,
Patrik

Hi,

We also very often see uhd_fft.py freezing, even at low symbol rates
(<= 2 Msps). However, if our own receiver software is used, we very
seldom see a freeze.

I suspect this freeze issue could be related to rendering?

Patrik

On 26 nov 2012, at 07:28, Josh B. [email protected] wrote:

Can you confirm that it’s not just the GUI locking up?

Negative. It’s not just the GUI locking up.
In fact, the GUI does NOT lock up itself; the rendering (FFT plot)
“freezes” simply because no more samples are delivered from the N210
device.
The GUI buttons and sliders are still responsive/operational but of
course have no effect, since no more samples are received.
My original description, “freeze”, was probably a bit misleading. It’s
rather a “complete stall” or “full stop”, which leads to a GUI rendering
(plotting) “freeze” due to missing samples.

Even if I just run an example with UHD -> Null sink (that is, without
any rendering GUI), I get the exact same problem: samples stop coming
after a short period of time.
Or if I just try to save the samples to a file, it stalls/stops and I
can only save relatively few samples.
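
One way to confirm such a stall without any GUI involved is to log the
network interface's receive byte counter and look for the moment the
rate collapses. The following is a minimal sketch under assumptions: a
Linux host where /sys/class/net/<iface>/statistics/rx_bytes is
available, and function names, interface name, and threshold that are
illustrative choices rather than anything provided by UHD.

```python
import time
from pathlib import Path


def read_rx_bytes(iface):
    """Read the cumulative received-byte counter of a Linux network interface."""
    return int(Path(f"/sys/class/net/{iface}/statistics/rx_bytes").read_text())


def log_counter(iface, count, interval=1.0):
    """Collect `count` (timestamp, rx_bytes) pairs, one every `interval` seconds."""
    samples = []
    for _ in range(count):
        samples.append((time.monotonic(), read_rx_bytes(iface)))
        time.sleep(interval)
    return samples


def first_stall(samples, min_rate_bps):
    """Return the start time of the first interval whose receive rate is
    below min_rate_bps, or None if the rate never drops that low."""
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip degenerate intervals
        if (b1 - b0) * 8 / dt < min_rate_bps:
            return t0
    return None


# Example while a receive flowgraph is running (interface name is an assumption):
#   print(first_stall(log_counter("eth1", 60), 400e6))
```

If the counter stops increasing while the application is still running,
the samples really have stopped arriving at the host, independent of any
plotting.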

I think what Patrik describes in his responses is probably a different
issue, which I have sometimes also experienced, but it relates to the
GUI, not necessarily the UHD communication.
In my case, the samples just stop coming from the device, so I believe
it’s UHD related.

IMPORTANT NOTE: The green LED labeled C on the N210 goes from a steady
ON (when samples are delivered) to suddenly OFF at the instant when
samples stop coming to the host computer.
Also, the large orange LED goes from a steady ON when samples are
streamed to very rapidly blinking at the instant when samples stop
coming.
Finally, not until I abort the stalled application the fast blinking
orange LED turns off completely again.

It’s a timeout. If something was lost due to overflow, the recv call
will time out trying to obtain the lost samples. It’s just the way the
example was written.

That makes sense and matches what I experience with the other
tests/examples described above.
The timeout halts/stalls/stops the application, since samples suddenly
stop coming from the device to the host.

/ Rickard

On 26 nov 2012, at 13:43, Josh B. [email protected] wrote:

I think you can eliminate 1 and 2. Can you confirm 3 with a wireshark run?

Thanks. I will try but I am not so familiar with wireshark.
Just briefly tried to use it but it was not trivial to set up, as I
recall.

/ Rickard

On 26 nov 2012, at 13:56, Rickard [email protected] wrote:

IMPORTANT NOTE: The green LED labeled C on the N210 goes from a steady ON
(when samples are delivered) to suddenly OFF at the instant when samples stop
coming to the host computer.
unless the sample rate is > 25e6 and the wire format is still sc16

-josh

I have now played with Wireshark.

I do not see what you suggested, an “ICMP destination unreachable
packet” or something similar.
The only ICMP-related traffic occurs when I connect the device and set
up the IP address; there are no “unreachable packets” or similar during
the UHD run.
While running, there are only UDP frames.

Instead, however, I discovered some really suspect behavior: the ports
change wildly back and forth on both the device and the host, and the
UDP packet/frame size then changes a lot too.
This happens at the beginning of the streaming (see attached packets
241–287); then after a while it settles to the requested (constant)
packet size (3050 bytes, close to the requested 3008 bytes) and the
ports become fixed (see packets 1271-1277).

But then at the end, when it all fails, the ports on the device and host
suddenly change again and the packets become very small, only 58 or 60
bytes (see packets 211078 --).
The little actual data in those failing packets seems quite odd too.
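
To locate the point in a long capture where the full-size data packets
end, the packet list can be exported from Wireshark as CSV and scanned.
This is a sketch under assumptions: the column names "No." and "Length"
are taken to match Wireshark's default packet-list columns, the function
name is made up, and the sample rows in the comment are illustrative
values, not rows from the attached capture.

```python
import csv
import io


def last_full_size_frame(csv_text, min_len):
    """Return the frame number of the last packet whose length is at least
    min_len bytes, or None if no packet is that large."""
    last = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["Length"]) >= min_len:
            last = int(row["No."])
    return last


# Illustrative input (made-up rows in the assumed CSV layout):
#   "No.","Length"
#   "1271","3050"
#   "1272","3050"
#   "211078","60"
# Calling last_full_size_frame(text, 3000) on it picks out frame 1272.
```

The frames right after the returned number are the ones worth inspecting
by hand, since that is where the stream switches to the small packets.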

Please find the transcript of the mentioned and seemingly crucial
packets from the beginning and from the end, when the UHD communication
fails. Note the selected packets’ numbering.
Wireshark captured the following command:
uhd_fft -a "addr=192.168.10.2, recv_frame_size=3008" -s 25e6

Does this give any clues?

/ Rickard

<<Another question, when the app locks up, what makes it recover? Reload
app, power cycle user, replug eth cable, re ifconfig, restart PC?>>

In my case, I had to press ctrl+C and then restart the code to make it
work.

Thanks,

Nazmul

On Tue, Nov 27, 2012 at 1:13 AM, Josh B. [email protected] wrote:

#define USRP2_UDP_UART_BASE_PORT 49170
unspecified/aka left default MTU?

