Here's a little beauty

luislavena · July 6, 2011, 3:15am

Relatively simple flow-graph, not complete yet by any stretch of the
imagination.

It starts out innocuously enough, but really gets going after a while.
The RSS grows by about 150 MB/minute, the virtual size at a somewhat
slower pace. After a few minutes of running, the RSS has grown
enough that it’s consuming most of the physical memory on the system,
and the kernel starts looking for solutions to the problem. The
virtual size grows to about 6.2 GB on my system before things get
utterly unusable, and I have to go to the console and kill it off.
If I reduce the bandwidth, it grows more slowly, but it still grows,
and grows, and grows, and grows.

I was even well-behaved in my FFT size: 4096 bins, which is a nice even
multiple of the page-size and everything.

The GNU Radio memory behaviour is really starting to bug me. A lot. So
much so that I’m contemplating going directly from UHD to my
application (in this case, a multi-channel riometer). I really don’t
want to do that. If I had the time to dedicate to it, and the required
depth of knowledge of the scheduler guts, I’d fix it myself and post
patches. I understand that the memory tricks are stream-performance
“optimizations”, but it’s rather suboptimal when your system is eaten
alive by stuff that, at least given a superficial glance, should be
fairly innocuous (except, perhaps, for the 25 Msps bit, which nobody
would argue is “casual” by any measure).

Marcus_DSLeech · July 6, 2011, 3:41am

On Tue, Jul 5, 2011 at 6:13 PM, Marcus D. Leech [email protected]
wrote:

Discuss-gnuradio mailing list
[email protected]
Discuss-gnuradio Info Page

Hi Marcus,

What are you using the vector sink for? I can’t find anything that
unloads it. If you look at the source code, this block continuously
calls “push_back” on an STL vector container (element size is the
GNU Radio vector). So if nothing ever empties it, then it should
consume all the memory space…?
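The growth pattern described here can be sketched in a few lines. This is not GNU Radio code: `UnboundedSink` and `BoundedSink` are hypothetical stand-ins, and the bounded variant is only one possible way to cap memory, assuming the newest data is all that matters.

```python
from collections import deque


class UnboundedSink:
    """Hypothetical sink that keeps every item it is handed (append only)."""

    def __init__(self):
        self.items = []

    def work(self, vectors):
        self.items.extend(vectors)


class BoundedSink:
    """Hypothetical sink that keeps only the newest max_items items."""

    def __init__(self, max_items):
        # A deque with maxlen silently discards the oldest entries.
        self.items = deque(maxlen=max_items)

    def work(self, vectors):
        self.items.extend(vectors)


def run(sink, n_calls, vectors_per_call, vector_len):
    """Feed the sink n_calls batches and report how many vectors it holds."""
    for _ in range(n_calls):
        sink.work([[0.0] * vector_len for _ in range(vectors_per_call)])
    return len(sink.items)


unbounded_count = run(UnboundedSink(), n_calls=100, vectors_per_call=8, vector_len=16)
bounded_count = run(BoundedSink(max_items=32), n_calls=100, vectors_per_call=8, vector_len=16)
print(unbounded_count, bounded_count)
```

The first count grows linearly with the number of calls, while the second stays at the configured cap no matter how long the loop runs.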

–Colby

Marcus_DSLeech · July 6, 2011, 3:49am

Yes, I just (like three minutes ago) realized that the vector sink
was causing all the memory-leak grief. It was part of an incomplete
thought.

So, I’ll back off a few dB on my GNU Radio memory management diatribe,
but only a few dB.

–
Marcus L.
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · July 6, 2011, 3:59am

What I’m trying to build is a cheap (in the computational sense)
multi-channel power estimator, and I’m running out of options.
I need to be able to carve out up to four variable-width,
non-uniformly-spaced channels anywhere in the 20 MHz to 45 MHz region,
which is 25 MHz of bandwidth. I tried four conventional bandpass
filters, each followed by the usual power-detector sequence
(complex-to-mag**2/IIR-filter/keep-one-in-n). That produced a lot
of overruns. I then tried the same thing using FFT filters instead
of the usual FIR filters. That was no better.
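For readers unfamiliar with the power-detector sequence named above, here is a minimal plain-Python sketch of it. It is an illustration only, not the GNU Radio blocks themselves. It assumes a single-pole IIR of the form y[n] = alpha*x[n] + (1 - alpha)*y[n-1] and assumes the decimator emits the last sample of each group of n. The function name and parameters are made up for this example.

```python
def power_detector(samples, alpha, keep_one_in):
    """complex-to-mag**2 -> single-pole IIR smoothing -> keep one in n.

    samples     : iterable of complex baseband samples
    alpha       : IIR smoothing factor in (0, 1]; smaller means longer averaging
    keep_one_in : decimation factor applied to the smoothed power
    """
    out = []
    state = 0.0
    for i, s in enumerate(samples):
        power = s.real * s.real + s.imag * s.imag       # |s|**2
        state = alpha * power + (1.0 - alpha) * state   # single-pole IIR
        if (i + 1) % keep_one_in == 0:                  # emit every n-th value
            out.append(state)
    return out


# A constant-amplitude tone of magnitude 2 has power 4; the smoothed
# estimate starts at 0 and settles towards that value.
estimates = power_detector([2 + 0j] * 1000, alpha=0.05, keep_one_in=100)
print(len(estimates), round(estimates[-1], 3))
```

The per-sample cost here is tiny. In the arrangement described above, the expensive part is the four bandpass filters running at the full input rate ahead of each detector.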

So then I thought, maybe an FFT of suitable size, and I can simply “pick
off” the bins that correspond to my channels of interest.
I could only make that work by decimating the FFT input vectors by a
factor of 3, then integrating the outputs, similar to what the
FFT graphical display does. But it does chug along producing those
output vectors. The question is how to efficiently turn them into
something I can use for per-channel power estimates, within the
confines of a GRC-produced flow-graph.
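One way the “pick off the bins” step could look is sketched below in plain Python, outside any flow-graph. It assumes the FFT output vector is already magnitude-squared and in DC-centred order, so that bin 0 sits at the centre frequency minus half the sample rate. The centre frequency, channel edges and function names are hypothetical values chosen for illustration.

```python
def channel_bins(f_lo, f_hi, center_freq, sample_rate, fft_size):
    """Return indices of the bins whose centre frequency lies in [f_lo, f_hi].

    Assumes bin 0 sits at center_freq - sample_rate / 2 (DC-centred order).
    """
    bin_width = sample_rate / fft_size
    lowest = center_freq - sample_rate / 2.0
    return [k for k in range(fft_size)
            if f_lo <= lowest + k * bin_width <= f_hi]


def channel_power(mag_sq, bins):
    """Mean magnitude-squared over one channel's bins (0.0 if it has none)."""
    if not bins:
        return 0.0
    return sum(mag_sq[k] for k in bins) / len(bins)


# Hypothetical setup: 25 Msps centred on 32.5 MHz, 4096-point FFT.
CENTER, RATE, NFFT = 32.5e6, 25e6, 4096
# Hypothetical channel edges in Hz; any widths and spacings are allowed.
CHANNELS = [(22.0e6, 22.5e6), (30.0e6, 31.0e6), (38.2e6, 38.3e6), (44.0e6, 44.8e6)]

# The bin lists depend only on the configuration, so build them once.
BINS = [channel_bins(lo, hi, CENTER, RATE, NFFT) for lo, hi in CHANNELS]

# Stand-in for one integrated FFT output vector (flat spectrum of 1.0).
vector = [1.0] * NFFT
print([channel_power(vector, b) for b in BINS])
```

Because the bin lists are precomputed, the per-vector work is a handful of short sums, and each channel can have its own width and position.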

–
Marcus L.
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · July 6, 2011, 4:11am

Have you tried the polyphase filter channelizer blocks?

–Colby

Marcus_DSLeech · July 6, 2011, 4:15am

Not clear to me how to use them to effect non-uniformly-spaced channels.
Also, individual channels will have their own bandwidths.

Marcus_DSLeech · July 6, 2011, 4:52am

Perhaps using wavelets? The idea is to look at a much more “coarse”
piece of spectrum until there is something of possible interest (cheap
to compute). Then zoom in when something pops up (expensive to compute).

Marcus_DSLeech · July 6, 2011, 5:21am

Thanks. I did try the polyphase channelizer briefly, using a simple
low-pass filter, and it provoked roughly as many ‘O’s (overruns) as any
of the other approaches. I think I may need a faster CPU for handling
25 Msps, no matter how I do it.

–
Marcus L.
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · July 6, 2011, 5:04am

On Jul 5, 2011, at 10:50 PM, Colby B. wrote:

Not clear to me how to use them to effect non-uniformly-spaced channels. Also,
individual channels will have their own bandwidths.

See also: “Polyphase Filter Banks For Unequal Channel Bandwidths And
Arbitrary Center Frequencies” by fred harris et al. I don’t know of
anyone who’s implemented this yet; and, really, IIRC, the complexity is
absurd. But it might work - MLD

Marcus_DSLeech · July 6, 2011, 6:13am

What sort of CPU are you using?

–Colby
AMD Phenom II X6 1055T, with 6 GB of 1333 MT/s memory. Rough ballpark
calculations show me that even a 4096-bin FFT shouldn’t take more than
about 0.45 GFlop/sec at 25 Msps, and the CPU is easily capable of at
least 8 GFlop/sec/core. So I’m not sure why it’s barfing at 25 Msps.
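A ballpark like this depends on the operation-count convention. The sketch below uses the common 5·N·log2(N) count for a complex radix-2 FFT and assumes back-to-back, non-overlapping transforms. Both are assumptions, since the post does not say how the 0.45 figure was derived.

```python
import math


def fft_flops_per_second(fft_size, sample_rate, flops_per_point_stage=5.0):
    """Estimate sustained FLOP/s for non-overlapping FFTs of a sample stream.

    flops_per_point_stage=5.0 is the conventional radix-2 complex FFT count
    (5 * N * log2(N) per transform); other conventions give other totals.
    """
    flops_per_fft = flops_per_point_stage * fft_size * math.log2(fft_size)
    ffts_per_second = sample_rate / fft_size
    return flops_per_fft * ffts_per_second


rate = fft_flops_per_second(4096, 25e6)
print(f"{rate / 1e9:.2f} GFLOP/s")  # 1.50 GFLOP/s
```

Under these assumptions the estimate is a few times higher than the figure quoted in the post, but still well below the per-core capability it mentions. That is consistent with the suspicion that raw FFT arithmetic is not the bottleneck.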

I’ve tried using both the on-mobo 1GigE interface and a PCI-resident
one. Neither of those makes any difference to getting large numbers of
‘O’ at 25 Msps.

If I decimate by 3 or more before the FFT (after vectorizing), it runs
OK, consuming about 40% of the total system CPU, and not producing any
‘O’. I could then, theoretically, process the FFT output vector to
extract only the magnitudes of the bins that correspond to my channels
of interest. But decimating by 3 means that I’m losing sensitivity by a
factor of sqrt(3), which I’d rather not have to “swallow”. The
application is radio astronomy, where sensitivity is quite important.
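The sqrt(3) figure follows from the ideal radiometer equation, in which the relative rms fluctuation scales as 1/sqrt(B·tau). Keeping one input vector in three cuts the effective integration time to a third. A minimal sketch follows, with a hypothetical function name and an arbitrary 10-second integration chosen only for illustration.

```python
import math


def relative_noise(bandwidth_hz, integration_s, duty_fraction=1.0):
    """Relative rms fluctuation from the ideal radiometer equation.

    duty_fraction is the share of the input data actually processed;
    it scales the effective integration time.
    """
    return 1.0 / math.sqrt(bandwidth_hz * integration_s * duty_fraction)


full = relative_noise(25e6, 10.0)
one_in_three = relative_noise(25e6, 10.0, duty_fraction=1.0 / 3.0)
penalty = one_in_three / full
print(f"{penalty:.3f}")  # 1.732
```

The penalty does not depend on bandwidth or integration time. It comes only from the fraction of data thrown away.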

–
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · July 6, 2011, 6:27am

I can borrow my lab mate’s N210 and see what kind of performance I can
get out of it on my T410 ThinkPad (i7 proc).

Marcus_DSLeech · July 6, 2011, 5:20pm

Have you tried to find out which blocks are consuming the CPU time? I
once used oprofile, and it worked quite well. You get a list of
functions that consume cycles, both by themselves and cumulatively over
all the functions they call. With that list you can search the source
code and find the problematic blocks.

Patrick

Engineer’s motto: cheap, good, fast: choose any two
Patrick S.
Student of Telematics, Techn. University Graz, Austria

Marcus_DSLeech · July 6, 2011, 5:38am

What sort of CPU are you using?

–Colby