How to balance workload among cores? Laptop cannot keep up with the USRP2 data flow

Hi all!

I am transmitting and receiving an FM signal at the same time, but I see a
lot of "SSSSS" messages, which mean my CPU cannot keep up with all the
frames generated by the USRP2 and drops most of them.
However, I have a fairly new laptop: an Intel® Core™ i5 CPU
M 520 @ 2.40GHz running Ubuntu 10.04.

I would guess that my laptop is powerful enough to run my FM Tx/Rx, and I
wonder if I can balance the workload among all four of my cores to see if
my application improves (so that the Ethernet interface doesn't drop any
frames).

Looking at other posts on the mailing list I found:

1) "Try using numactl".
Well, that is not possible for me; after installing that package and running:
:~$ numactl -s
physcpubind: 0 1 2 3
No NUMA support available on this system.

2) "Look at the output of cat /proc/cpuinfo; 'physical id' indicates which
cores are in which sockets."
In all four processors, physical id : 0
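For what it's worth, here is a small Python sketch (Linux-only, parsing the same "physical id" and "core id" fields of /proc/cpuinfo mentioned above) that counts sockets, physical cores, and hardware threads; the helper name is just illustrative:

```python
def count_cpus(path="/proc/cpuinfo"):
    """Count sockets, physical cores and hardware threads (Linux only)."""
    sockets, cores, threads = set(), set(), 0
    phys = None
    with open(path) as f:
        for line in f:
            if ":" not in line:
                continue
            key, _, val = line.partition(":")
            key, val = key.strip(), val.strip()
            if key == "processor":
                threads += 1              # one stanza per hardware thread
            elif key == "physical id":
                phys = val                # socket this thread lives in
                sockets.add(val)
            elif key == "core id":
                cores.add((phys, val))    # (socket, core) = one physical core

    return len(sockets), len(cores), threads

if __name__ == "__main__":
    s, c, t = count_cpus()
    print(f"{s} socket(s), {c} physical core(s), {t} hardware thread(s)")
```

If the core count comes out lower than the thread count, the difference is hyperthreading rather than extra physical cores.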

Furthermore, when I run the top command I see:
 PID USER  PR NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
3022 root  20  0 238m 66m 45m S  166  3.6 3:32.12 python

Is my GRC flowgraph only executing on one core? How can I find out?
Any ideas on how to fix it?
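One quick way to check, sketched here with nothing but the standard library (Linux-only; the PID and helper name are illustrative), is to count the kernel threads under /proc/&lt;pid&gt;/task of the running flowgraph and print the CPU set the scheduler is allowed to use:

```python
import os

def thread_count(pid):
    """Number of kernel threads in a process (Linux: /proc/<pid>/task)."""
    return len(os.listdir(f"/proc/{pid}/task"))

if __name__ == "__main__":
    pid = os.getpid()  # replace with the PID of the running python flowgraph
    print("threads:", thread_count(pid))
    print("allowed CPUs:", sorted(os.sched_getaffinity(0)))
```

Incidentally, a %CPU of 166 in top already suggests more than one core is in use, since a single core tops out at 100%.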

Many thanks,
Jorge


CPU scheduling is generally the responsibility of the operating-system
kernel, NOT GNU Radio. GNU Radio schedules blocks onto threads; the
execution of those threads, and their CPU assignment, is up to the
kernel. In general, the kernel does a good job of CPU scheduling.
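To illustrate that division of labour, here is a minimal Linux-only sketch: Python's os.sched_getaffinity/os.sched_setaffinity expose the CPU set the kernel is allowed to use for a process, and you can restrict it to a single core to experiment:

```python
import os

all_cpus = os.sched_getaffinity(0)        # CPU set the kernel allows us
print("default CPU set:", sorted(all_cpus))

os.sched_setaffinity(0, {min(all_cpus)})  # pin this process to one core
print("pinned to:", sorted(os.sched_getaffinity(0)))

os.sched_setaffinity(0, all_cpus)         # restore the original set
```

Pinning is almost never needed; left alone, the kernel will spread GNU Radio's block threads across all available cores by itself.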

Your problem is probably that your flow-graphs aren’t optimal in some
way, and thus take more system resources than they should.

The other issue may be that your GigE network interface isn't very good
in the buffering department, and thus makes the system work much harder
than it should when handling continuous packet traffic.

What bandwidth are you using? (That is, what decimation/interpolation
are you using?)
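As a rough illustration of why the decimation matters so much, assuming the USRP2's 100 MS/s ADC clock and 4-byte samples (16-bit I plus 16-bit Q) on the wire, the Ethernet load can be estimated like this:

```python
ADC_RATE = 100e6          # USRP2 ADC clock in samples/s (assumption above)
BYTES_PER_SAMPLE = 4      # 16-bit I + 16-bit Q

def gige_load(decimation):
    """Sample rate and byte rate on the wire for a given decimation."""
    sample_rate = ADC_RATE / decimation
    return sample_rate, sample_rate * BYTES_PER_SAMPLE

for decim in (5, 13, 25):
    sr, byte_rate = gige_load(decim)
    print(f"decim {decim:3d}: {sr / 1e6:6.2f} MS/s  ->  "
          f"{byte_rate * 8 / 1e6:6.1f} Mbit/s on the GigE link")
```

At decimation 5 that is already 640 Mbit/s in one direction, so full-duplex Tx/Rx at low decimation/interpolation can stress a laptop's GigE port and host stack long before the CPU itself is the bottleneck.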

You haven’t shared much in the way of details about your flow-graph, and
I suspect that’s where your problem lies.


Marcus L.
Principal Investigator
Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org

On Mon, Sep 27, 2010 at 10:07:07AM +0200, Jorge M. wrote:


First off, you don’t really have 4 cores. You’ve got 2 cores +
hyperthreading (they may have renamed it, but that’s what you’ve got).

GNU Radio will automatically use whatever you’ve got, without you
having to do anything special.

Be sure that your laptop is in “Performance mode”, and not trying to
save energy, or throttle back etc. There are also some laptops out
there that have poor thermal design and can’t really run at full speed
without overheating and thus throttling back the CPU.

Start with something like usrp_fft.py and see how low of a decimation
factor you can work with reliably. That will give you a basic idea of
how your system is working.

If you’re building your own flow graphs, it’s quite easy to string
together more blocks, or use a higher sample rate, than your machine
can keep up with.

After you’ve ruled out the stuff above, oprofile will help you sort
out which parts of your application are burning the most time.

Eric

On Thu, Sep 30, 2010 at 5:52 AM, Jorge M. [email protected] wrote:

ok by itself. The problem is when I run both in the same flowgraph. I get
a lot of dropped frames on my rx Ethernet interface (seen with ifconfig).
I tried to increase the RX Ethernet buffer (sysctl -w net.core.rmem_max=XXXXX)
but it didn’t work. I also tried to place the USRP source instance before
the USRP sink in my generated code, but that didn’t work either.

Could there be any reason why decreasing software computation (and Ethernet
traffic as well) in the transmitter path affects the performance of the RX
path?

Many thanks,
Jorge.

This doesn’t sound like a performance issue. I think you’re probably
looking at a sample rate mismatch somewhere.

Tom

Tom,

This doesn’t sound like a performance issue.
I absolutely agree.

I think you’re probably looking at a sample rate mismatch somewhere.
If that were the case, don't you think the FM transmitter by itself and
the FM receiver by itself shouldn't work either?
The point is that both of them work when running alone. That is the fact
I do not understand.

Regards,
Jorge.

Hi !

Eric, thanks for your advice. Some results here.

I set my computer into “performance mode”. At first sight nothing improves.

With the example usrp2_fft.py I can lower the decimation to 5 and the
application works ok.

As I said, the FM modulator works ok by itself. The FM demodulator also
works ok by itself. The problem is when I run both in the same flowgraph.
I think it is not a performance problem, since my CPUs are not heavily
loaded when my application is running.

My problem is that changes in the FM modulator which I would expect to
increase performance (like increasing my interpolation of 13 in the USRP
and decreasing my software interpolation) make my receiver chain stop
working (SSS messages), although the transmitter outputs a perfect FM
signal.

I do not understand these messages, since my CPUs are less than 15-20%
used. However, something is stopping the received data, because I get a
lot of dropped frames on my rx Ethernet interface (seen with ifconfig).
I tried to increase the RX Ethernet buffer (sysctl -w net.core.rmem_max=XXXXX)
but it didn’t work. I also tried to place the USRP source instance before
the USRP sink in my generated code, but that didn’t work either.
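As a side note on the buffer experiment: the kernel silently caps a socket's receive buffer at net.core.rmem_max, so after raising it with sysctl it is worth checking what a socket actually gets. A minimal sketch (using a UDP socket purely for illustration; the classic USRP2 interface uses raw Ethernet frames rather than UDP, and the 4 MB figure is an arbitrary example):

```python
import socket

requested = 4 * 1024 * 1024   # ask for a 4 MB receive buffer

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
# The kernel clamps the request to net.core.rmem_max (and doubles it
# internally for bookkeeping), so read back what was actually granted:
granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"requested {requested} bytes, kernel granted {granted} bytes")
s.close()
```

If the granted value is far below the request, the sysctl change has not taken effect for that socket.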

Could there be any reason why decreasing software computation (and Ethernet
traffic as well) in the transmitter path affects the performance of the RX
path?

Many thanks,
Jorge.
