Re: Correct method for "compressing" a power spectrum


#1

Hi,

From: Marcus D. Leech removed_email_address@domain.invalid

Let’s say I have an FFT output that’s many, many bins wide, and I want
to compress that information into a narrower display (let’s say from 4M bins down to 1024 bins).

My approach has been to sum up each set of [4M/1024] bins, and use that
as the final output.

Marcus L.

I think adding FFT bins is the correct way. But for me, I always select
the maximum in these bins. I’m not sure that this is right, but it gives
me a good overview of my spectrum data.

Best Regards,

Firas
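For reference, both reductions mentioned so far (summing each group of adjacent bins, and taking the group maximum) can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from either poster; only the 4M-to-1024 sizes come from the post above.

```python
import numpy as np

def compress_spectrum(psd, out_bins, mode="sum"):
    """Collapse a power spectrum to out_bins by summing (or taking the
    maximum over) each group of adjacent bins.  len(psd) must be an
    integer multiple of out_bins."""
    groups = np.asarray(psd, dtype=float).reshape(out_bins, -1)
    return groups.sum(axis=1) if mode == "sum" else groups.max(axis=1)

# 4M bins down to 1024, as in the original question
psd = np.abs(np.fft.fft(np.random.randn(4 * 1024 * 1024))) ** 2
display = compress_spectrum(psd, 1024)          # summed version
peaks = compress_spectrum(psd, 1024, "max")     # max-picked version
print(display.shape)
```

Summing preserves the total power landing in each display bin, while max-picking preserves narrow spikes that summing would dilute by the decimation factor.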


#2

I think adding FFT bins is the correct way. But for me, I always select the maximum in these bins. I’m not sure that this is right, but it gives me a good overview of my spectrum data.

Spectrum estimation is not trivial – I highly recommend having a look at
“Introduction to Spectral Analysis” by Stoica and Moses: a mathematical
yet readable introduction.

To solve your problem in a mathematically sound way, you can use windowing
in the time domain with subsequent averaging of overlapping blocks
(Welch’s method).

Gives a good practical estimate. Code for GNU Radio is here:

https://www.cgran.org/wiki/SpecEst

Integrating the FFT bins is just fine, but you probably lose frequency
resolution that way (if that doesn’t matter, just leave it).
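For completeness, here is a minimal sketch of Welch’s method using SciPy’s implementation (scipy.signal.welch) rather than the CGRAN GNU Radio code linked above; the sample rate, tone frequency, and segment length are arbitrary illustrative choices.

```python
import numpy as np
from scipy.signal import welch

fs = 1e6                                   # sample rate (arbitrary)
t = np.arange(1 << 16) / fs
x = np.sin(2 * np.pi * 100e3 * t) + 0.1 * np.random.randn(t.size)

# Welch: window each segment (Hann), overlap 50%, average the
# periodograms.  nperseg fixes the output resolution directly --
# no after-the-fact merging of FFT bins is needed.
f, pxx = welch(x, fs=fs, window="hann", nperseg=1024, noverlap=512)
print(f.shape)   # one-sided spectrum: nperseg // 2 + 1 points
```

Averaging many overlapped, windowed segments reduces the variance of the estimate at the cost of frequency resolution, which is exactly the trade-off described above.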

Best regards,
Jens


#3

On Mon, Mar 9, 2009 at 3:45 AM, Jens E. removed_email_address@domain.invalid wrote:

Gives a good practical estimate. Code for GNU Radio is here:

https://www.cgran.org/wiki/SpecEst

Good to see CGRAN being put to good use!

Integrating the FFT bins is just fine, but you probably lose frequency
resolution that way (if that doesn’t matter, just leave it).

As per the original question, I am wondering what the reason is for
taking the time to do a 4M-point FFT and then compressing the
frequency information. I would think you’re just throwing away
frequency resolution at that point and, inherently, wasting CPU
cycles.

So - back to Marcus - what is the overall goal of “compressing” your
FFT data? I have a feeling it’s for visualization of your data as
opposed to actually performing math or computational analysis, right?

If that is the case, I think you would want to come up with a clever
way of visually compressing the original data without compromising the
frequency resolution you’re computing. I’m not sure if anyone has
done this before, but you could possibly overlay your lines on top of
each other and use colors to indicate how many of the bins overlap.
This way, if you have a spike in one bin, it will be instantly
recognizable, and “zooming in” could reveal exactly which bin it was.
Moreover, if all the bins have large frequency content, that, too,
would be instantly recognizable as a wider-bandwidth signal as opposed
to a single-Hz spike.

I suppose this would constitute a visual averaging without
computationally losing the frequency resolution you’ve worked so hard
to achieve.

If it is not the case that you want to visualize the data, and,
indeed, you just need to clump your bins together - please disregard
this entire e-mail and idea.

Brian


#4

One thing you might look at is the “Rosenfell” algorithm used in some HP
spectrum analyzers (like the 8566). As I recall, they alternately take
the maximum and minimum values in the bins – i.e., if you’re collapsing
three bins into one and the original values are

010 050 010|010 050 010|120 130 120|120 130 120|010 050 010|010 050 010

You’d end up with

050 010 130 120 050 010

The idea is that this looks more like the “grass” that you would see on
a non-digital analyzer and accentuates the difference between noise and
stable signal, where averaging the bins would lose that distinction –
it has the effect of removing indicators that you’re looking at noise.

Not sure if this might be applicable to the application, and not sure I
got the details just right from memory, but I think searching for
“Rosenfell” would turn up more details.
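A sketch of the alternating max/min collapse described above, hedged the same way: the real Rosenfell ("rose-and-fell") logic in HP analyzers chooses max or min per bucket based on the data, so this fixed alternation is only an approximation of it.

```python
import numpy as np

def alt_max_min(psd, group):
    """Collapse each `group` adjacent bins, alternating between the
    group maximum (even output bins) and minimum (odd output bins)."""
    groups = np.asarray(psd, dtype=float).reshape(-1, group)
    out = np.empty(len(groups))
    out[0::2] = groups[0::2].max(axis=1)   # peaks
    out[1::2] = groups[1::2].min(axis=1)   # valleys ("grass")
    return out

bins = [10, 50, 10, 10, 50, 10, 120, 130, 120,
        120, 130, 120, 10, 50, 10, 10, 50, 10]
print(alt_max_min(bins, 3))   # 50, 10, 130, 120, 50, 10 as in the example
```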

John


#5

Brian P. wrote:

I have a feeling it’s for visualization of your data as opposed to actually performing math or computational analysis, right?

The original {1,2,4,8,16}M-bin data are used for SETI analysis, while the “compressed” version is used for a quick visual, “conventional” spectral display.


Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org


#6

On Mon, Mar 9, 2009 at 8:47 AM, Marcus D. Leech removed_email_address@domain.invalid
wrote:

The original {1,2,4,8,16}M-bin data are used for SETI analysis, while the “compressed” version is used for a quick visual, “conventional” spectral display.

In SETI analysis, is it more interesting to see a wider bandwidth
signal, or single frequency bins of larger magnitudes?

If you would rather see wider bandwidths, it might be interesting to
set each of N bins to an opacity of 1/N and draw them on top of each
other. That way the more that are drawn on top of each other appear
darker in appearance (wider bandwidth) as opposed to a single outlier
skewing all results. Moreover, for zooming, I think this would be the
most dynamic while maintaining full fidelity of the signal
representation.
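The overlay idea can be reduced to numbers without any plotting: fold the N bins that map onto one display bin on top of each other, and let the fraction of them exceeding a threshold play the role of the accumulated 1/N opacities. This is only a sketch of the idea; the function name and threshold are mine, not anything from the thread.

```python
import numpy as np

def overlay_density(psd, out_bins, thresh):
    """Fold len(psd)/out_bins adjacent bins onto each display bin and
    return the fraction exceeding `thresh`.  A value near 1/N suggests
    a single narrow spike; a value near 1.0 suggests wideband energy."""
    groups = np.asarray(psd, dtype=float).reshape(out_bins, -1)
    return (groups > thresh).mean(axis=1)

psd = np.zeros(16)
psd[3] = 10.0                        # one narrow spike
print(overlay_density(psd, 4, 1.0))  # spike shows as 1/4 density in bin 0
```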

I think there are a lot of good visualizations that can be done
without compromising the frequency resolution you have obtained with
such large FFTs.

I’d be interested to hear what method you end up using, and what is
important in your SETI analysis.

Brian


#7

I suggest that the best algorithm for this would be the rank order mean
alternated with the max so long as you are going to insert “heuristic
grass”. So it would be max, ROM, max, ROM, …

Let [B1, B2, B3, B4, … BN] be the powers of N adjacent bins.

Put them in rank order

[R1, R2, R3, R4, … RN]

If N is even, the rank order mean is (R_(N/2) + R_(N/2+1)) * 0.5.
If N is odd, the rank order mean is R_((N+1)/2).
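This max/ROM alternation can be sketched as follows. Note that the rank-order mean of a group is just its median (np.median implements exactly the even/odd rule above); this is a guess at an implementation, not actual code from the thread.

```python
import numpy as np

def max_rom_compress(psd, group):
    """Collapse each `group` adjacent power bins, alternating between
    the group maximum and the rank-order mean (median): max, ROM, ..."""
    groups = np.asarray(psd, dtype=float).reshape(-1, group)
    out = np.empty(len(groups))
    out[0::2] = groups[0::2].max(axis=1)          # max, ...
    out[1::2] = np.median(groups[1::2], axis=1)   # ..., ROM, max, ROM, ...
    return out
```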

But I still see a problem:

I suggest that, in order to prevent “scalloping” of a swept tone across
this algorithm or any other like it, some “bin” in the lossy
compression set of bins must ALWAYS be forced to take on the largest
value of the power spectrum; otherwise your alternating min/max/min/max
might jump up and down as you sweep a tone through it. Or did I miss
something?

Bob



Discuss-gnuradio mailing list
removed_email_address@domain.invalid
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio


(Co)Author: DttSP, Quiktrak, PowerSDR, GnuRadio
Member: ARRL, AMSAT, AMSAT-DL, TAPR, Packrats,
NJQRP, QRP ARCI, QCWA, FRC.
“It is human nature to think wisely and act in
an absurd fashion.”, Anatole France.
Twitter:rwmcgwier
Active: Facebook,Myspace,LinkedIn


#8

Brian P. wrote:

On Mon, Mar 9, 2009 at 8:47 AM, Marcus D. Leech removed_email_address@domain.invalid wrote:

The original {1,2,4,8,16}M-bin data are used for SETI analysis, while the “compressed” version is used for a quick visual, “conventional” spectral display.

In SETI analysis, is it more interesting to see a wider bandwidth
signal, or single frequency bins of larger magnitudes?

Ah, the very crux of the signal-detection arguments in SETI! The most
popular hypothesis is that our ET friends will use as narrow a signal
as possible, since that concentrates all the transmitter power at a
single frequency. But there are other theories around about what else
ET might do to let us know that they’re there: modulating the microwave
output of the stellar corona, for example (see the SASER project).
Something like that would be wideband, and not terribly coherent.


I’d be interested to hear what method you end up using, and what is
important in your SETI analysis.

I have two different “visualizations” available. The “compressed”
spectrum is all about the “usual” fairly-coarse spectrum you might want
for mapping radial velocities of neutral hydrogen, for example. The
full-resolution spectrum is used both for automated analysis and for a
companion visualizer with a completely conventional “waterfall”
display, in which you can display any given segment at full resolution.

At very high resolutions, these waterfall plots produce some fairly
spectacular-looking stuff, particularly when the low-noise feed is
sitting in an office full of computers, pointing at the ceiling.


Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org


#9

On Mon, Mar 9, 2009 at 8:49 AM, Bob McGwier removed_email_address@domain.invalid wrote:

Let [B1, B2, B3, B4, … BN] be the powers of N adjacent bins.

Put them in rank order

[R1, R2, R3, R4, … RN]

If N is even, the rank order mean is (R_(N/2) + R_(N/2+1)) * 0.5.
If N is odd, the rank order mean is R_((N+1)/2).

Here the rank-order mean is the median: the quantile value for
#quantiles = 2.

But I still see a problem:

I suggest that, in order to prevent “scalloping” of a swept tone across this
algorithm or any other like it, some “bin” in the lossy compression
set of bins must ALWAYS be forced to take on the largest value of the power
spectrum; otherwise your alternating min/max/min/max might jump up and down
as you sweep a tone through it. Or did I miss something?

Most surely. Why not some kind of VQ codebook based on deltas between
successive frames? Apart from the amount of research required to
build it, of course…

Neural net fans invited to take this on also! :wink:

Frank


#10

Bob McGwier wrote:

…might jump up and down as you sweep a tone through it. Or did I miss
something?

Bob
I love that phrase, Bob. “Heuristic grass”.

I think that the real answer is to “play with it, and see what best
suits the application”. Ranking the bins might end up being
expensive. I’d probably use qsort(), but I couldn’t remember its
computational complexity. Just looked it up: worst case is O(N**2),
and best case is O(N log N). That worst case is rather brutal at 4M
bins (and even worse at my maximum of 16M bins).
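As an aside on the cost question: a full sort isn’t actually needed for a median; a selection algorithm finds it in O(N) average time. NumPy’s np.partition (introselect) is one readily available implementation, sketched here as a general observation rather than anything proposed in the thread.

```python
import numpy as np

def fast_median(x):
    """Median via selection (np.partition, introselect): O(N) on
    average, with no full O(N log N) sort of the bin values."""
    x = np.asarray(x, dtype=float)
    k = x.size // 2
    if x.size % 2:                       # odd count: single middle value
        return float(np.partition(x, k)[k])
    part = np.partition(x, [k - 1, k])   # even count: mean of the two
    return 0.5 * (part[k - 1] + part[k])
```

Selection stays linear even at millions of bins, which sidesteps the qsort() worst case entirely.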

This discussion has ended up being much more fascinating than I had
predicted. Keep it up everybody!


Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org


#11

Frank’s comments are right on median and ROM in this case. I stayed up
until 4 AM and went to work after having breakfast and a cup of coffee,
arriving by 9. It is showing.

The entire gist of my comments amounts to nothing more than: don’t allow
“aliasing” of the spectral changes as you traverse the compressed power
spectrum to cause you to miss a peak that is ABOVE the detection
threshold.

So, pardon me, but is this a pretty-picture exercise or a real
detection problem? If it is a detection problem, then you might as
well just compress to the largest value in the bins to be pushed
together, so you assure that your threshold is exceeded when it should
be. If it is a pretty-picture problem, just prevent scalloping by any
old heuristic, make it as fleet of calculation feet as possible, and
throw in some pretty grassy stuff to make it look nice.

Bob





#12

Bob McGwier wrote:

Frank’s comments are right on median and ROM in this case. I stayed
up until 4 AM and went to work after having breakfast and a cup of
coffee, arriving by 9. It is showing.

The entire gist of my comments amounts to nothing more than: don’t allow
“aliasing” of the spectral changes as you traverse the compressed
power spectrum to cause you to miss a peak that is ABOVE the detection
threshold.

The detection algorithm works on the full-resolution FFT output (several
million bins), but for purposes of having an “overview” to look at, it
needs to be compressed.


Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org


#13

Oh horrors. Please let it not be true that the phrase I am remembered
most for is “heuristic grass”.

Your comments about MELP were insightful, and now that you have said it,
more becomes crystal clear. I raise another absinthe in your honor… or
was that dishonor.

On Mar 9, 2009 6:49pm, Frank B. removed_email_address@domain.invalid wrote:

On Mon, Mar 9, 2009 at 1:08 PM, Bob McGwier removed_email_address@domain.invalid wrote:

So, pardon me, but is this a pretty-picture exercise or a real detection
problem? If it is a detection problem, then you might as well just
compress to the largest value in the bins to be pushed together, so you
assure that your threshold is exceeded when it should be. If it is a
pretty-picture problem, just prevent scalloping by any old heuristic,
make it as fleet of calculation feet as possible, and throw in some
pretty grassy stuff to make it look nice.

Amen to that.

To say so is not to dismiss the level of art required by heuristic grass,
however. That’s pretty much what MELP adds, and it makes a world of
difference to the user.


#14

On Tue, Mar 10, 2009 at 9:39 AM, removed_email_address@domain.invalid wrote:

Oh horrors. Please let it not be true that the phrase I am remembered
most for is heuristic grass.

Isn’t heuristic grass what gives absinthe the green color?

Frank


#15

On Mon, Mar 9, 2009 at 1:08 PM, Bob McGwier removed_email_address@domain.invalid wrote:

So, pardon me, but is this a pretty-picture exercise or a real detection
problem? If it is a detection problem, then you might as well just
compress to the largest value in the bins to be pushed together, so you
assure that your threshold is exceeded when it should be. If it is a
pretty-picture problem, just prevent scalloping by any old heuristic,
make it as fleet of calculation feet as possible, and throw in some
pretty grassy stuff to make it look nice.

Amen to that.

To say so is not to dismiss the level of art required by heuristic
grass, however. That’s pretty much what MELP adds, and it makes a world
of difference to the user.

Frank


#16

Frank B. wrote:

On Tue, Mar 10, 2009 at 9:39 AM, removed_email_address@domain.invalid wrote:

Oh horrors. Please let it not be true that the phrase I am
remembered most for is heuristic grass.

Isn’t heuristic grass what gives absinthe the green color?

Just remember – absinthe makes the tart grow blonder.

John


#17

removed_email_address@domain.invalid wrote:

Oh horrors. Please let it not be true that the phrase I am remembered
most for is heuristic grass.

I’m near-certain that you’ve said more memorable things than that,
Bob. But that’s the one that is proximate right now :slight_smile:


Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org


#18

On Tue, Mar 10, 2009 at 10:02 AM, John Ackermann N8UR removed_email_address@domain.invalid
wrote:

Just remember – absinthe makes the tart grow blonder.

Sigh. With fronds like these, who needs anemones?

(Sorry – done now.)

Frank


#19

John Ackermann N8UR wrote:

Just remember – absinthe makes the tart grow blonder.

John

You clearly know a more interesting class of tart than I do :slight_smile:


Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org


#20

OK, so I decided to use the averaging method, rather than the maximum
method. It produces reasonably good-looking plots:

http://www.science-radio-labs.com/files/spectral_example.ps

(The GUI toolkit I’m using allows you to take a PostScript dump of the
XYPlot object.)


Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org