Forum: GNU Radio Thread scheduling for high-sensitivity, high-resolution FFTs

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
Marcus D. Leech (Guest)
on 2009-01-07 01:59
(Received via mailing list)
Let's say, for the sake of interesting discussion, that I have access to
a beastly multi-CPU 'puter, perhaps based on the
  latest Xeon 7460 series CPUs, and a quad-socket server, giving me 24
CPUs, and 48 threads.

I'm interested in wideband, high-sensitivity, high-resolution
spectroscopy.

I envisage a scheme where one thread gathers an FFT-frame worth of data
(let's say from an USRP2), and doles it out to one
  of a number of other threads that are doing FFTs on these frames of
data.  All the outputs of the FFT threads are averaged
  together to produce a result that is then time averaged.

The existing FFT display schemes in Gnu Radio use a "keep-one-in-n"
scheme for the FFTs, which means that sensitivity
  is necessarily lost, due to the spectrum being under-sampled in the
time domain.

Is there existing plumbing to be able to try this sort of thing? [On a
smaller scale--I don't actually have such a monster server
  at my disposal at the moment!!].


--
Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org
Eric B. (Guest)
on 2009-01-07 04:47
(Received via mailing list)
On Tue, Jan 06, 2009 at 06:58:47PM -0500, Marcus D. Leech wrote:
> data.  All the outputs of the FFT threads are averaged
>
Is this an SMP machine, or a cluster?

If SMP, it'll "just work".  If it's a cluster, you'll need to figure
out how to partition separate instantiations of GR on each node.

Eric
Marcus D. Leech (Guest)
on 2009-01-07 06:21
(Received via mailing list)
Eric B. wrote:
>
>
> Is this an SMP machine, or a cluster?
>
> If SMP, it'll "just work".  If it's a cluster, you'll need to figure
> out how to partition separate instantiations of GR on each node.
>
> Eric
>
>
SMP.

Describe how it'll "just work"??

Maybe I haven't looked at all the blocks carefully enough, but I can't
immediately see how to do this.

I basically want a "distributor" block that distributes FFT input frames
to multiple instances of an FFT computation,
  across multiple threads, and have the results of the separate FFTs
synchronized.



--
Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org
Eric B. (Guest)
on 2009-01-07 07:39
(Received via mailing list)
On Tue, Jan 06, 2009 at 11:20:48PM -0500, Marcus D. Leech wrote:
> SMP.
>
> Describe how it'll "just work"??
>
> Maybe I haven't looked at all the blocks carefully enough, but I can't
> immediately see how to do this.
>
> I basically want a "distributor" block that distributes FFT input frames
> to multiple instances of an FFT computation,
>   across multiple threads, and have the results of the separate FFTs
> synchronized.

Sounds good.  Sorry,  I misunderstood your goal.
(I'm working on the guts of FFTW now and have FFT on the brain...)

I think we have the blocks you need to do the fanout and fanin
already.  They're disguised as

  gr.deinterleave(itemsize)
  gr.interleave(itemsize)

I think you could use them like this:

  fft_size = 512

  usrp = ...

  fanout = gr.deinterleave(fft_size * gr.sizeof_complex)
  fanin = gr.interleave(fft_size * gr.sizeof_complex)

  fft0 = ...
  fft1 = ...
  fft2 = ...
  fft3 = ...

  connect(usrp, fanout)

  connect((fanout, 0), fft0)
  connect((fanout, 1), fft1)
  connect((fanout, 2), fft2)
  connect((fanout, 3), fft3)

  connect(fft0, (fanin, 0))
  connect(fft1, (fanin, 1))
  connect(fft2, (fanin, 2))
  connect(fft3, (fanin, 3))


Each block runs in its own thread, so this should run faster on an SMP
machine.  Please let us know if this works for you!

[FYI, there are also two extra copies happening in the gr.fft_* block
that I
now know how to remove...]

Eric
Marcus D. Leech (Guest)
on 2009-01-07 08:05
(Received via mailing list)
Eric B. wrote:
>>>
>> synchronized.
>
>   fft1 = ...
>   connect(fft0, (fanin, 0))
>
> Eric
>
>
Yes, I came to the same conclusion about gr.interleave trickery myself,
and will have to noodle over it when I'm awake.



--
Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org
Bob McGwier (Guest)
on 2009-01-07 17:23
(Received via mailing list)
From what Eric has described (which should work), you have choices.  The
choices would be to increase sensitivity by adding the results of the
fft's
which is the same as producing spectra from long term autocorrelation
which
will increase sensitivity O(n) because it is coherent averaging or if
you
sist on noncoherent averaging sensitivity will be increase O(sqrt(n))
but
only by decreasing the variance of the noise.  And/or you can build
larger
fft's from the intermediate ones via several techniques to increase
frequency resolution.  The latter requires more careful organization
because
you do not wish to window the smaller ones if you wish to combine them
later.  This would require windowing in the frequency domain and
necessarily
the expense of convolution rather than multiplication.

This is all a fruitful area for experimentation now that we have the new
TPB
scheduler.

Bob


ARRL SDR Working Group Chair
Member: ARRL, AMSAT, AMSAT-DL, TAPR, Packrats,
NJQRP, QRP ARCI, QCWA, FRC.
"And yes I said, yes I will Yes", Molly Bloom
Marcus D. Leech (Guest)
on 2009-01-07 19:05
(Received via mailing list)
Bob McGwier wrote:
> the expense of convolution rather than multiplication.
>
In the existing FFT sink, many frames are dropped, and the FFT output
bins are averaged using a single-pole
  IIR.  It takes long integration times for this to be as sensitive as
an FFT in which no input frames are ever dropped.

In my application (SETI), having long integration times hurts, because
of chirp (although one could also de-chirp
  the baseband prior to computing spectra).

I'll conduct some experiments as Eric has suggested, once I get my
quad-core system back in operation.
  [Don't muck with hardware when you're too sick to cope.  That's how
LGA775 sockets get bent
  pins :-( :-( :-( ].
> This is all a fruitful area for experimentation now that we have the new TPB
> scheduler.
>
> Bob
>
Indeed


--
Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org
Bob McGwier (Guest)
on 2009-01-07 21:20
(Received via mailing list)
Marcus:

What is the time constant on your IIR?  If the time constant is longer
than
two FFT's you are probably losing sensitivity doing noncoherent
averaging.
All I am suggesting is that a computation is in order to determine the
"noise figure" of the two approaches.

Bob

ARRL SDR Working Group Chair
Member: ARRL, AMSAT, AMSAT-DL, TAPR, Packrats,
NJQRP, QRP ARCI, QCWA, FRC.
"And yes I said, yes I will Yes", Molly Bloom
Marcus D. Leech (Guest)
on 2009-01-08 11:56
(Received via mailing list)
Eric B. wrote:
>
> In your ideal world, how many bins would you like in your FFT?
>
> Eric
>
>
The application is SETI, so 10Hz or better resolution over several Mhz
of bandwidth would be ideal.


--
Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org
Paul M. (Guest)
on 2009-01-14 21:13
(Received via mailing list)
I have my '900 MHz AM Receiver with AFC' running satisfactorily, but I'm
not
happy with some of the approaches I ended up using to accomplish the
AFC. I
spliced together most of fftsink2.py and usrp_am_rcvr.py and modified
both
as follows:

A. fftsink2.py
Added code to input_watcher class run method (thread):
1)find highest peak in FFT data arriving via msgq
2)calculate the frequency difference between nominal tuning freq and the
peak, i.e., tuning error or drift.
3)calculate new tune freq and retune the usrp

B. usrp_am_rcvr.py:
Adjusted filter parameters to my requirements.
Added staticmethod access to its topblock class 'set_freq' method.
Added global variables to contain information about the single instance
of
this class that is invoked by __main__.

What I don't like about the above is using staticmethod to get access to
the
set_freq method of the top block. However, I have not figured out how to
address the single instance of the topblock that exists. Considering the
code snippets below, how can I address the instance of wm_rx_block
(which I
referred to as 'top block' earlier) that gets passed to stdgui2?

A little more description of the app:
3 stdgui2 FFTs are in view: FFT of USRP data, FFT of channel filter
output,
FFT of audio
Channel filter output is supposed to have a single peak centered on 0
frequency. Any difference can be used to adjust usrp tuning freq. In
practice Thread-2 handles channel filter data and knows the freq offset.
As
usual, the usrp object is bound to a topblock, an instance of class
wam_rx_block. What I don't understand is how to address this instance.
You
might be able to tell me by simply looking at the __main__ code below.

I'm sure that you longtime Python and gnuradio folks will consider this
trivial, but I have a new bald spot. Any hints will be appreciated.
Paul M.
---------------------------------------------------------------------------

Here's __main__

if __name__ == '__main__':
    app = stdgui2.stdapp (wam_rx_block, "USRP 915MHz AM ISM RX")
    app.MainLoop ()

And here are the most important bits of wam_rx_block:

class wam_rx_block (stdgui2.std_top_block):

    my_usrp = None
    my_usrp_subdev = None

    def __init__(self,frame,panel,vbox,argv):
        global target_freq, tune_freq, my_usrp, my_usrp_subdev

  < option parsing and other standard setup stuff snipped >

        tune_freq = self.freq
        target_freq = tune_freq
        my_IF_freq = self.IF_freq    # allow class method access

        ################# Build main flowgraph ##################

        #TODO: add an AGC after the channel filter and before the
AM_demod

        self.u = usrp.source_c()                    # usrp is data
source
        my_usrp = self.u                            # allow global
access

  < more standard setup stuff snipped >

        self.u.set_mux(usrp.determine_rx_mux_value(self.u,
                                                   options.rx_subdev_spec))
        self.subdev = usrp.selected_subdev(self.u,
options.rx_subdev_spec)
        my_usrp_subdev = self.subdev

  < wx GUI setup stuff snipped >


    def set_freq(self, usrp_freq):
        """
        Set the center frequency we're interested in.

        @param target_freq: frequency in Hz
        @rypte: bool

        Tuning is a two step process.  First we ask the front-end to
        tune as close to the desired frequency as it can.  Then we use
        the result of that operation and our target_frequency to
        determine the value for the digital down converter.
        """
        r = usrp.tune(self.u, 0, self.subdev, usrp_freq + self.IF_freq)
        #TODO: check if db is inverting the spectrum or not to decide
        # if we should do + self.IF_freq  or - self.IF_freq

        return r

    def set_freq_static(usrp_freq):
        global target_freq, tune_freq, my_usrp, my_IF_freq,
my_usrp_subdev

        r = usrp.tune(my_usrp, 0, my_usrp_subdev, usrp_freq +
my_IF_freq)

    set_freq_static = staticmethod(set_freq_static)

----------------------------------------------------------------------------

And, here's the modified run method for input_watcher in fftsink2, with
the
call to set_freq_static near the end:

    def run (self):
        global target_freq, tune_freq

        while (self.keep_running):
            msg = self.msgq.delete_head()  # blocking read of message
queue
            # FFT results are prefixed by vector size and number of
frames.
            itemsize = int(msg.arg1())
            nitems = int(msg.arg2())

            s = msg.to_string()            # get the body of the msg as
a
string

            # There may be more than one FFT frame in the message.
            # If so, we take only the last one
            if nitems > 1:
                start = itemsize * (nitems - 1)
                s = s[start:start + itemsize]

            # String actually contains f.p. numbers, so convert.
            complex_data = fromstring (s, float32)

            # find location and amplitude of highest peak
            self.maxval = max(complex_data)
            self.pk_bin = argmax(complex_data)
            self.avgval = mean(complex_data)

            self.cycles_per_bin = self.sample_rate / self.fft_size

            # Show size of packet, max value, index of max, and mean
value.
            # decide if peak is significant
            # FIXME: make threshold a parameter and/or dynamic
            if ((self.maxval - self.avgval) >= 8):
                # 1st vlen/2 bins get peak when center_freq is low,
                # with offset maximum = 1/2 of BW at bin vlen/2
                # 2nd vlen/2 bins get peak when center_freq is high,
                # with offset maximum = 1/2 of BW at bin vlen/2
                # Both halves repeat for freq diffs > FFT width / 2.
                # For usrp_rate = 64 MHz / 256 = 250 kHz, do bin
(vlen/2)
                # represents f = -125 kHz, +125 kHz,...
                # If we restrict the range of center_freq to within BW/2
                # of carrier, there can be no ambiguity.
                # Given a peak located at bin N, solve for frequency:
                #      cycles_per_bin = usrp_rate/vlen,
                # This can be refined by calculating a mean pk_freq
value
                # based on several bins,
                # starting with the values on either side of the peak:
                if self.pk_bin > 0:
                    left_bin = self.pk_bin - 1
                else:
                    left_bin = (self.fft_size - 1)
                if self.pk_bin < (self.fft_size - 2):
                    right_bin = self.pk_bin + 1
                else:
                    right_bin = 0
                left_val = complex_data[left_bin]
                right_val = complex_data[right_bin]
                # Calculate a weighted average bin location starting
with
                # the fraction of a bin to offset the center.
                wtd_sum = (right_val - left_val) / (left_val +
right_val)
                self.pk_bin += wtd_sum
                # Adjust for bottom wraparound
                if (self.pk_bin < 0): self.pk_bin += (nitems - 1)
                # Adjust for top wraparound
                if (self.pk_bin > (self.fft_size - 1)):
                    self.pk_bin += (1 - self.fft_size)
                # Calculate equivalent freq of bin, accounting for
spectral
                # folding around center.
                if self.pk_bin < self.fft_size / 2:
                    self.pk_freq = int(self.center_freq +
                                       (self.pk_bin *
self.cycles_per_bin))
                else:
                    self.pk_freq = int(self.center_freq +
                                       ((self.pk_bin - (self.fft_size -
1))
*
                                        self.cycles_per_bin))

                ################### RETUNING OCCURS HERE:
####################
                # NOTE: For the passband data, pk_freq represents freq
drift.
                if self.afc:
                    if abs(self.pk_freq) > 200:
                        target_freq = tune_freq + self.pk_freq/3     #
correct for drift
                        print "retune:", self.pk_freq, "to", target_freq
                        wam_rx_block.set_freq_static(target_freq)

################################################################

            # package the data as a wx event object
            de = DataEvent (complex_data)
            # package this packet as a data event within wx and
            # push this single fft plot data packet into fftsink msgq
            wx.PostEvent (self.event_receiver, de)
            del de # no further need for this copy of data
Paul M. (Guest)
on 2009-02-11 21:53
(Received via mailing list)
Here's a screen shot of FFT output:

http://i394.photobucket.com/albums/pp28/optoeng/2peaks.jpg

The image is typical, in that there are multiple narrow local maxima,
but
the amplitude of these particular peaks is quite a bit higher than I
expect
'in the field'. I'm trying to work out an efficient way to compute the
locations and relative amplitudes of these maxima. Can someone please
suggest either an algorithm and/or the most similar gnuradio application
that I can examine as candidates for this analyis? Thanks.
Paul M.
John C. (Guest)
on 2009-02-12 00:36
(Received via mailing list)
Paul M. schrieb:
> Paul M.
I thought Analog TV was going of the air in the US... The difficulty is
that when the FFT is taken
from the real world, and the signal tends to the noise level, most
'peak' detection algorithms fail,
well, have lots of noise in the 'accept' condition...

So, long term averaging, and 'knowing' that a peak would appear at
offset x from the right or
left side of the spectrum will be more likely to succeed.

John C..
Paul M. (Guest)
on 2009-02-12 03:40
(Received via mailing list)
Signal has nothing to do with TV.

Paul M. schrieb:
> analyis? Thanks. Paul M.
I thought Analog TV was going of the air in the US... The difficulty is
that when the FFT is taken
from the real world, and the signal tends to the noise level, most
'peak' detection algorithms fail,
well, have lots of noise in the 'accept' condition...

So, long term averaging, and 'knowing' that a peak would appear at
offset x from the right or
left side of the spectrum will be more likely to succeed.

John C..
Paul M. (Guest)
on 2009-05-19 02:47
(Received via mailing list)
About a month ago, our USRP1 Rev. 4.5 (Ser. #2728) became erratic in it
USB
connection and communications. A short while after, USB failed
altogether. I
put scope probes on the USB + and - data lines and observed that one of
the
signals had a much different sort of signal than the other. Following
advice
from Matt E. given to others who had experienced USB failure, I
ordered a
replacement USB chip along with a Chipquik SMD removal kit. Replacement
of
the chip had no effect, so I was forced to look more carefully for a
cause.
It turns out that one of the 0603 capacitors installed near the USB
receptacle (C73, C74) was shorted, and removal of these 2 capacitors
restored USB function. I'm posting this description of my experiences
for
the benefit of others on the list.

I suspect that C73 and C74 (which I can't find in various schematics and
BOMs for USRP1) were added in the hope that they would provide a path
for
electrostatic discharges, which is ironic. In my experience,
low-capacitance
avalanche breakdown devices are more suitable for this purpose.

I can highly recommend ChipQuik.

Paul M.
Matt E. (Guest)
on 2009-05-19 03:01
(Received via mailing list)
Paul M. wrote:
> the benefit of others on the list.
>
> I suspect that C73 and C74 (which I can't find in various schematics and
> BOMs for USRP1) were added in the hope that they would provide a path for
> electrostatic discharges, which is ironic. In my experience, low-capacitance
> avalanche breakdown devices are more suitable for this purpose.


Paul,

Thanks for your analysis of the situation.  C73 and C74 are not actually
capacitors, they are, as you say, electrostatic discharge protectors,
part number PESD0402-060.  I have not seen any fail in the manor you
mention, but I suppose enough static could cause that.

Has anyone else seen this failure mode?

Matt
Paul M. (Guest)
on 2009-05-19 03:24
(Received via mailing list)
Thanks for the clarification about C73, C74. The Raychem datasheet
doesn't
say so directly, I'm guessing that these parts are some form of
varistor. In
my experience, varistors are less reliable than silicon clamping
devices,
both in terms of their dynamic clamping characteristics and their
ultimate
failure characteristics.

http://www.conformity.com/artman/publish/printer_211.shtml

http://www.nxp.com/acrobat_download/applicationnot...

I work in a high humidity environment, and I'd be surprised if ESD was
the
cause of failure of these parts, anyway. I suspect that flexing of the
circuit board during USB plugging/unplugging caused cracking. The large
solder fillets on the parts may have contributed to the problem.

Paul M.
Marcus D. Leech (Guest)
on 2009-05-19 03:28
(Received via mailing list)
Matt E. wrote:
>
> Matt
>
>
Aren't ESD protectors usually MOVs, and don't MOVs usually fail by
shorting?  They can absorb so many joules of ESD, and
  after that they short out.

In live electrical power lines,  this usually causes the breaker to
trip.

The other type are GDTs, which fail open.  But they don't fail very
often.

--
Marcus L.
Principal Investigator, Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org
This topic is locked and can not be replied to.