I thought we had squashed this one

aris · June 6, 2012, 5:48am

Got a core dump this evening from this evening’s GIT build, on a Fedora
12 machine (I know, horribly obsolete)
with a Centrino M CPU:

#0 0x003bb409 in volk_32fc_x2_multiply_32fc_a_sse3 ()
from /usr/local/lib/libvolk.so.0.0.0
#1 0x00396415 in get_volk_32fc_x2_multiply_32fc_a ()
from /usr/local/lib/libvolk.so.0.0.0
#2 0x0096b7e7 in gri_fft_filter_ccc_generic::filter(int,
std::complex<float> const*, std::complex<float>*) ()
from /usr/local/lib/libgnuradio-core-3.6.1git.so.0.0.0
#3 0x00972a5b in gr_fft_filter_ccc::work(int, std::vector<void const*,
std::allocator<void const*> >&, std::vector<void*, std::allocator<void*>
>&) ()
from /usr/local/lib/libgnuradio-core-3.6.1git.so.0.0.0
#4 0x00942187 in gr_sync_decimator::general_work(int, std::vector<int,
std::allocator<int> >&, std::vector<void const*, std::allocator<void
const*> >&, std::vector<void*, std::allocator<void*> >&) ()
from /usr/local/lib/libgnuradio-core-3.6.1git.so.0.0.0
#5 0x00929455 in gr_block_executor::run_one_iteration() ()
from /usr/local/lib/libgnuradio-core-3.6.1git.so.0.0.0
#6 0x00944bb3 in
gr_tpb_thread_body::gr_tpb_thread_body(boost::shared_ptr<gr_block>, int)
() from /usr/local/lib/libgnuradio-core-3.6.1git.so.0.0.0
#7 0x0093eecc in
boost::detail::function::void_function_obj_invoker0<gruel::thread_body_wrapper<tpb_container>,
void>::invoke(boost::detail::function::function_buffer&) () from
/usr/local/lib/libgnuradio-core-3.6.1git.so.0.0.0
#8 0x0031084c in boost::function0::operator()() const ()
from /usr/local/lib/libgruel-3.6.1git.so.0.0.0

Here’s /proc/cpuinfo

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 14
model name : Genuine Intel(R) CPU T2400 @ 1.83GHz
stepping : 8
cpu MHz : 1833.000
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc
arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm
bogomips : 3657.43
clflush size : 64
cache_alignment : 64
address sizes : 32 bits physical, 32 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 14
model name : Genuine Intel(R) CPU T2400 @ 1.83GHz
stepping : 8
cpu MHz : 1833.000
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc
arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm
bogomips : 3657.52
clflush size : 64
cache_alignment : 64
address sizes : 32 bits physical, 32 bits virtual
power management:

Seems to have been provoked by replacing a regular FIR filter with an
FFT filter (or rather, replacing
3 FIR filters with 3 FFT filters with the same “shape” as the FIR
filters).

Curiously enough, on Ubuntu 12.04 on exactly the same hardware, I
don’t have this problem, even
with this evening’s build.

–
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · June 7, 2012, 3:47am

On Tue, Jun 5, 2012 at 11:45 PM, Marcus D. Leech [email protected]
wrote:

Got a core dump this evening from this evening’s GIT build, on a Fedora
12 machine (I know, horribly obsolete)
with a Centrino M CPU:

We did squash this bug. Why you gotta keep bringing up the past?

What version of FFTW is on that machine? The block should be calling
into fftw_malloc to get aligned arrays for the taps and the in/out
buffers are from the fftw object, anyways, so they are aligned. So
there’s nothing that should be producing anything unaligned unless
FFTW isn’t working like I think it should be working.

Tom

Marcus_DSLeech · June 7, 2012, 4:13am

On Wed, Jun 6, 2012 at 9:52 PM, Marcus D. Leech [email protected]
wrote:

It’s a 32-bit machine–if that makes a difference on the alignment.
Ok, I’m not entirely sure why I asked that, since I don’t know enough
about FFTW’s version history to judge anything from the number. And 3.2.2
isn’t that old.

We’d really have to dig deeper. See what the addresses of a, b, and c
in gri_fft_filter_ccc_generic::filter actually are to see which one is
returning something unaligned.
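
A minimal sketch of that check, assuming the three pointers can be reached from a debugger or a temporary debug print inside the filter. The names `is_aligned` and `report_alignment` are hypothetical helpers, not part of GNU Radio or VOLK:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Returns true when ptr is a multiple of `alignment` bytes.
// `alignment` must be a power of two; 16 is what the SSE "_a" kernels expect.
inline bool is_aligned(const void* ptr, std::size_t alignment = 16)
{
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical debug helper: print a pointer and whether it is 16-byte aligned.
inline void report_alignment(const char* name, const void* ptr)
{
    std::fprintf(stderr, "%s = %p (%s)\n", name, ptr,
                 is_aligned(ptr) ? "16-byte aligned" : "NOT 16-byte aligned");
}
```

Calling `report_alignment("a", a);` (and likewise for `b` and `c`) just before the VOLK multiply would show which buffer is off.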

Tom

Marcus_DSLeech · June 7, 2012, 3:53am

FFTW is: fftw-3.2.2-1.fc12.i686

It’s a 32-bit machine–if that makes a difference on the alignment.

–
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · June 7, 2012, 4:35am

Turns out that neither ‘a’ nor ‘c’ is 16-byte aligned, and they both
come from FFTW’s allocator.

Weird.

And the same thing applies to the _fff variant as well.

–
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · June 7, 2012, 4:55am

On Wed, Jun 6, 2012 at 10:34 PM, Marcus D. Leech [email protected]
wrote:

Turns out that neither ‘a’ nor ‘c’ is 16-byte aligned, and they both
come from FFTW’s allocator.

Weird.

And the same thing applies to the _fff variant as well.

Arg… ok, so that means that FFTW isn’t trying to use any
vectorization on your machine. I wonder if it’s something done at
compile time or if that version just isn’t handling your Centrino
properly.
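
If this FFTW build really does hand out memory that is not 16-byte aligned, one way to get aligned buffers independently of FFTW is `posix_memalign`. This is only a sketch of a possible workaround under that assumption, not a description of any actual fix; `alloc_aligned16` and `free_aligned16` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free (POSIX)

// Hypothetical helper: allocate nbytes with 16-byte alignment,
// regardless of how the FFTW library on the machine was built.
// Returns nullptr on failure.
inline void* alloc_aligned16(std::size_t nbytes)
{
    void* p = nullptr;
    if (posix_memalign(&p, 16, nbytes) != 0)
        return nullptr;
    return p;
}

// Memory from posix_memalign is released with plain free().
inline void free_aligned16(void* p)
{
    free(p);
}
```

Buffers obtained this way can be passed to the aligned (`_a`) kernels; the alternative is to fall back to an unaligned code path whenever a pointer fails the alignment check.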

I have the very smallest start of an idea for how to handle this, but
it’s late here and I’ve been working all day, so any idea sounds like
a good idea right now. Let me see what I can come up with tomorrow.

Tom

Marcus_DSLeech · June 7, 2012, 4:16am

Ok, I’ll check on them.

–
Principal Investigator
Shirleys Bay Radio Astronomy Consortium