Correlation Estimator Bugs

musicdenotation · July 1, 2015, 6:27am

Hi all,

I was just looking through the code in corr_est_cc_impl.cc because it is
behaving unexpectedly and I think I have spotted a few bugs.

1. If you look at line 203 inside _set_threshold and line 240 inside
   work, the correlation output is being squared in both cases. The
   output of the correlation operation is already proportional to the
   voltage squared, so squaring it again makes it proportional to the
   fourth power of the voltage. These extra squares are likely a bug.
   I don’t think this has a big effect on function, but it’s not
   optimal and adds extra calculations nonetheless. It is also the case
   that the phase_est and time_est values are being calculated on mag^4
   data instead of mag^2 data. Because both are monotonically
   increasing, there is again probably no functional effect, but once
   again it is not optimal.

2. The peak value of the correlation is being reported as twice as big
   as it should be able to get, given a known input length with no
   noise added. For example, given a length 10 sequence which we
   correlate against, when the sequences align perfectly you expect a
   peak value of 100 (1 or -1 times itself, summed 10 times, is 10,
   which the current code implementation then squares to 100). Yet the
   peak values reported by the corr_start tag value are 200, which
   shouldn’t be achievable. I can’t figure out where this factor of two
   is being introduced in the code.

3. This may be intended, but doesn’t feel right. When you look at the
   correlation output port (port 1), the values on the y-axis seem to
   have been square rooted, so that they appear to be in the range you
   would expect, though the corr_start tag shows the squared peak value
   that is proportional to the 4th power of the voltage. It’s
   confusing, and the fact that these two different values are being
   sent to the user makes me think this is unintentional.

4. I have noticed that when I start my receiver first, with a power
   squelch gate up front, and then start my transmitter, the correlator
   places thousands of tags on the initial output of the squelch. While
   the input data is very low, the values of the mag^2 being reported
   in the corr_start tag value are very high, well above threshold. I
   don’t know how this could happen. It’s causing thousands of false
   tags to be placed in the stream that should not be there. I have not
   figured out anything concrete about this bug. It’s possible it could
   be my setup, but I don’t see how yet. This is an over-the-air test.

I’m hoping the original block designers can provide some feedback that
might help me understand what the intentions were.

Thanks,
Rich

reback · July 1, 2015, 4:03pm

Hi Richard,

I’m not the original author, but hopefully I can shed a little light.

> I was just looking through the code in corr_est_cc_impl.cc because it
> is behaving unexpectedly and I think I have spotted a few bugs.
>
> If you look at line 203 inside _set_threshold and line 240 inside
> work, the correlation output is being squared in both cases. The
> output of the correlation operation is already proportional to the
> voltage squared, so squaring it again makes it proportional to the
> fourth power of the voltage. These extra squares are likely a bug. I
> don’t think this has a big effect on function, but it’s not optimal
> and adds extra calculations nonetheless. It is also the case that the
> phase_est and time_est values are being calculated on mag^4 data
> instead of mag^2 data. Because both are monotonically increasing,
> there is again probably no functional effect, but once again it is
> not optimal.

Most of this is intentional and not a bug.

Consider:

a. the complex auto-correlation is guaranteed to be real and positive at
its peak at lag time t=0; although not necessarily real and positive at
any other lag

b. the complex cross-correlation of a reference signal and a similar,
noisy incoming signal is never guaranteed to be real valued and
positive at any lag value. So comparing cross-correlation values with
an auto-correlation peak requires comparing complex number magnitudes,
since only one of the two complex numbers is known to be a real,
positive number.

c. The sqrt() function has a non-zero computational cost. We wish to
avoid computing n Msqrts/second if we can, to save CPU for other things.

d. Comparing the magnitude squared of complex numbers for “<”, “==”, or
“>” is equivalent to comparing the magnitude of complex numbers and we
avoid taking millions of square roots per second.

So no, the squaring is not extra and not a bug; it’s intentional. It’s
not suboptimal either: finding the complex magnitude is a squaring
(i.e. z*z_conj) followed by a square root, which is more expensive than
the squaring alone. The code is misleading in that it has variables
named d_mag[] as opposed to d_mag2[], for example.
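
Points c and d can be illustrated with a minimal standalone sketch. It is not code from corr_est_cc_impl.cc; the function names argmax_mag2 and argmax_mag are made up for this example. It only shows that searching for the strongest sample by magnitude squared picks the same index as searching by magnitude, without any square roots:

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Index of the strongest correlation sample, found by comparing |z|^2.
// std::norm(z) returns re*re + im*im, so no square root is evaluated.
// (Hypothetical helper, not part of GNU Radio.)
std::size_t argmax_mag2(const std::vector<std::complex<float>>& corr)
{
    assert(!corr.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < corr.size(); ++i) {
        if (std::norm(corr[i]) > std::norm(corr[best])) {
            best = i;
        }
    }
    return best;
}

// The same search using the true magnitude (one sqrt per sample),
// included only to show that the ordering is unchanged.
std::size_t argmax_mag(const std::vector<std::complex<float>>& corr)
{
    assert(!corr.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < corr.size(); ++i) {
        if (std::abs(corr[i]) > std::abs(corr[best])) {
            best = i;
        }
    }
    return best;
}
```

Because squaring is monotonic for non-negative values, both functions return the same index for any input.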

I would encourage you to think of the cross-correlation as a joint
property of two signals, and not some sort of power level. That way you
can talk about things like “magnitude of the correlation peak” and leave
the signal’s linear amplitude out of the picture if it confuses things.

So there is a bug in the GUI and input arguments: the threshold is
compared against magnitude-squared values. If the user wants a 90%
threshold, for example, the user needs to enter 0.9*0.9 or 0.81 in the
GUI. This is minor, easily worked around, and also easily fixed.
(patch?)
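
One possible shape for such a fix, as a sketch only: square the user's linear fraction once, then compare against magnitude-squared values. The helper names and the ref_peak_mag parameter below are hypothetical and do not appear in the block's source.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: convert a linear threshold fraction (e.g. 0.9 for
// "90% of the peak magnitude") into the value to use against |z|^2 data.
float mag2_threshold_fraction(float linear_fraction)
{
    assert(linear_fraction >= 0.0f);
    return linear_fraction * linear_fraction;
}

// Hypothetical check: is this correlation sample above the requested
// fraction of a reference peak magnitude? Only squared quantities are
// compared, so no square root is needed per sample.
bool above_threshold(std::complex<float> corr, float ref_peak_mag, float linear_fraction)
{
    const float thresh_mag2 =
        mag2_threshold_fraction(linear_fraction) * ref_peak_mag * ref_peak_mag;
    return std::norm(corr) > thresh_mag2;
}
```

With a reference peak magnitude of 10 and a 0.9 fraction, a sample of magnitude 9.5 passes and one of magnitude 8.5 does not, which is what a user entering "90%" would expect.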

> The peak value of the correlation is being reported as twice as big
> as it should be able to get, given a known input length with no noise
> added. For example, given a length 10 sequence which we correlate
> against, when the sequences align perfectly you expect a peak value
> of 100 (1 or -1 times itself, summed 10 times, is 10, which the
> current code implementation then squares to 100). Yet the peak values
> reported by the corr_start tag value are 200, which shouldn’t be
> achievable. I can’t figure out where this factor of two is being
> introduced in the code.

Hmm. Without seeing some matlab or octave code, I can’t say for sure
where exactly the factor of two comes from.
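
For reference, the arithmetic in the quoted example (peak of 10, squared to 100) can be reproduced with a small standalone sketch. It assumes an ideal, noise-free +/-1 sequence and plain zero-lag correlation; it does not model any scaling the block itself may apply, and correlate_aligned is a name invented for this example.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Zero-lag cross-correlation: sum over n of x[n] * conj(ref[n]).
// (Hypothetical helper, not part of GNU Radio.)
std::complex<float> correlate_aligned(const std::vector<std::complex<float>>& x,
                                      const std::vector<std::complex<float>>& ref)
{
    assert(x.size() == ref.size());
    std::complex<float> acc(0.0f, 0.0f);
    for (std::size_t i = 0; i < x.size(); ++i) {
        acc += x[i] * std::conj(ref[i]);
    }
    return acc;
}
```

For a length 10 sequence of +/-1 values correlated against itself, the result has real part 10 and magnitude squared (std::norm) 100, matching the expectation in the quoted text. A reported value of 200 would therefore have to come from somewhere other than this sum.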

After looking at this comment I threw in:

https://github.com/gnuradio/gnuradio/blob/master/gr-digital/lib/corr_est_cc_impl.cc#L280

        double nom = d_corr_mag[i-1]-d_corr_mag[i+1];
        double denom = 2*(d_corr_mag[i-1]-2*d_corr_mag[i]+d_corr_mag[i+1]);
        center = nom/denom;
    }
    #else
    // Calculates the center of mass between the three points around the peak.
    // Estimate is linear.
    double nom = 0, den = 0;
    nom = d_corr_mag[i - 1] + 2 * d_corr_mag[i] + 3 * d_corr_mag[i + 1];
    den = d_corr_mag[i - 1] + d_corr_mag[i] + d_corr_mag[i + 1];
    double center = nom / den;
    center = (center - 2.0); // adjust for bias in center of mass calculation
    #endif

    // Estimated scaling factor for the input stream to normalize
    // the output to +/-1.
    uint32_t maxi;
    volk_32fc_index_max_32u_manual(&maxi, (gr_complex*)in, noutput_items, "generic");
    d_scale = 1 / std::abs(in[maxi]);

    // Calculate the phase offset of the incoming signal.

the 2*A factor out front may hold your answer. I encourage you to look
up the paper whence that expression came:

https://github.com/gnuradio/gnuradio/blob/master/gr-digital/include/gnuradio/digital/corr_est_cc.h#L78

     * Marple, Jr., S. L., "Estimating Group Delay and Phase Delay
     * via Discrete-Time 'Analytic' Cross-Correlation, _IEEE_Transcations_
     * _on_Signal_Processing_, Volume 47, No. 9, September 1999
     *
     */
    typedef enum {
        THRESHOLD_DYNAMIC,
        THRESHOLD_ABSOLUTE,
    } tm_type;

    class DIGITAL_API corr_est_cc : virtual public sync_block
    {
    public:
        typedef std::shared_ptr<corr_est_cc> sptr;

        /*!
         * Make a block that correlates against the \p symbols vector
         * and outputs a phase and symbol timing estimate.
         *
         * \param symbols           Set of symbols to correlate against (e.g., a
         *                          sync word).

It may be specific to the modulated band-limited pulses used in that
paper, but maybe it’s a general expression for complex
cross-correlation.
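
As a side note, the center-of-mass branch in the first excerpt above can be tried in isolation. The sketch below copies only that three-point formula into a standalone function; the name center_of_mass_offset and its parameter names are made up here, and the three arguments stand in for d_corr_mag[i-1], d_corr_mag[i] and d_corr_mag[i+1].

```cpp
#include <cassert>

// Three-point center-of-mass estimate around a peak, following the
// formula in the excerpt: weights 1, 2, 3, then subtract 2 to remove the
// bias so that a symmetric peak yields an offset of 0.
// (Hypothetical standalone helper, not part of GNU Radio.)
double center_of_mass_offset(double left, double mid, double right)
{
    const double den = left + mid + right;
    assert(den != 0.0);
    const double nom = left + 2.0 * mid + 3.0 * right;
    return nom / den - 2.0;
}
```

A symmetric peak such as (1, 4, 1) gives 0, while (1, 4, 3) gives +0.25 and (3, 4, 1) gives -0.25, so the sign indicates which neighbor the true peak leans toward.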

> This may be intended, but doesn’t feel right. When you look at the
> correlation output port (port 1), the values on the y-axis seem to
> have been square rooted, so that they appear to be in the range you
> would expect, though the corr_start tag shows the squared peak value
> that is proportional to the 4th power of the voltage. It’s confusing,
> and the fact that these two different values are being sent to the
> user makes me think this is unintentional.

You are correct, that output is the complex cross-correlation without
being ‘squared’ (i.e. z*z_conj). It is intentional. You can plot the
real part and magnitude of the correlation output and grossly eyeball
the phase at the correlation peak: if they line up, the two input
signals are in phase; if they are flipped in sign, the two input
signals are 180 degrees out of phase; if the real part is 0, the two
input signals are +/-90 degrees out of phase; and everything in between.
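
The same eyeball check can be written down numerically. This is a minimal sketch with invented helper names; it operates on a single complex correlation sample and is not taken from the block's code.

```cpp
#include <cassert>
#include <complex>

// Ratio of the real part to the magnitude of a correlation sample.
// This equals cos(phase): +1 when in phase, -1 at 180 degrees,
// and 0 at +/-90 degrees. (Hypothetical helper.)
float real_to_mag_ratio(std::complex<float> peak)
{
    const float mag = std::abs(peak);
    assert(mag > 0.0f);
    return peak.real() / mag;
}

// Phase of the correlation sample in radians, via std::arg.
// (Hypothetical helper.)
float peak_phase(std::complex<float> peak)
{
    return std::arg(peak);
}
```

Plotting the real part against the magnitude is a visual version of real_to_mag_ratio; peak_phase gives the angle directly when a number is preferred over a plot.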

The correlation value on the output tag being squared, yeah that’s
probably a bug. But it was like that when I got here. That could be
easily worked-around or fixed as well.

> I have noticed that when I start my receiver first, with a power
> squelch gate up front, and then start my transmitter, the correlator
> places thousands of tags on the initial output of the squelch. While
> the input data is very low, the values of the mag^2 being reported in
> the corr_start tag value are very high, well above threshold. I don’t
> know how this could happen. It’s causing thousands of false tags to
> be placed in the stream that should not be there. I have not figured
> out anything concrete about this bug. It’s possible it could be my
> setup, but I don’t see how yet. This is an over-the-air test.

Without a flowgraph and maybe a data file, I’m at a loss on this one.

The tags should never be closer than the length of your preamble in
samples though.
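
One way to check that expectation on a recording is to collect the sample offsets of the corr_start tags and count how many neighbors are closer than the preamble length. The sketch below assumes the offsets have already been extracted into a sorted vector; count_close_tags is a hypothetical helper, not a GNU Radio function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts adjacent tag pairs spaced closer than the preamble length.
// `offsets` holds absolute sample offsets of corr_start tags and is
// assumed to be sorted in ascending order. (Hypothetical helper.)
std::size_t count_close_tags(const std::vector<std::uint64_t>& offsets,
                             std::uint64_t preamble_len)
{
    std::size_t violations = 0;
    for (std::size_t i = 1; i < offsets.size(); ++i) {
        assert(offsets[i] >= offsets[i - 1]); // sorted input assumed
        if (offsets[i] - offsets[i - 1] < preamble_len) {
            ++violations;
        }
    }
    return violations;
}
```

A nonzero count on the squelch output would confirm that tags are landing closer together than the preamble length and would be useful to include with a flowgraph and data file.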

Is the squelch block before or after the corr_est block?

Regards,
Andy