After make test failed for this module, I decided to poke around to see
whether there is an easy fix. I wrote a script that simply runs the test
over and over until it segfaults, then exits once the core file has been
created.
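For anyone who wants to reproduce this, the loop amounts to something like the sketch below. It is not the exact script I ran: the function name run_until_crash and the retry limit are made up for illustration, and it assumes bash, where a process killed by a signal is reported with exit status 128 plus the signal number (139 for SIGSEGV).

```shell
#!/usr/bin/env bash
# Re-run a command until it appears to be killed by a signal.
# Assumption: an exit status above 128 means "killed by signal (status - 128)",
# which is the bash convention; a program could also exit with such a code itself.
run_until_crash() {
    local max_runs="$1"; shift
    local i status
    for ((i = 1; i <= max_runs; i++)); do
        "$@" >/dev/null 2>&1
        status=$?
        if ((status > 128)); then
            echo "run $i: killed by signal $((status - 128))"
            return 0
        fi
    done
    echo "no crash in $max_runs runs"
    return 1
}
```

Typical use would be to run `ulimit -c unlimited` first so a core file is actually written, then something like `run_until_crash 1000 ./one_test.sh`, where one_test.sh is a placeholder for whatever runs the failing test once.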
[email protected]:~/src/gnuradio/build/gr-digital/python/digital$ ./runtests.sh
Using Volk machine: avx_64_mmx
Segmentation fault (core dumped)
[email protected]:~/src/gnuradio/build/gr-digital/python/digital$ gdb /usr/bin/python2.7 core
(gdb) bt
#0  0x00007fe8f627fb17 in volk_32fc_32f_dot_prod_32fc_a_avx ()
    from /home/kelly/src/gnuradio/build/volk/lib/libvolk.so.0.0.0
#1  0x00007fe8f52dd25f in gr::filter::kernel::fir_filter_ccf::filter(std::complex<float> const*) ()
    from /home/kelly/src/gnuradio/build/gr-filter/lib/libgnuradio-filter-3.8git.so.0.0.0
#2  0x00007fe8f143c45b in gr::digital::pfb_clock_sync_ccf_impl::general_work(int, std::vector<int, std::allocator<int> >&, std::vector<void const*, std::allocator<void const*> >&, std::vector<void*, std::allocator<void*> >&) ()
    from /home/kelly/src/gnuradio/build/gr-digital/lib/libgnuradio-digital-3.8git.so.0.0.0
#3  0x00007fe8f653809e in gr::block_executor::run_one_iteration() ()
    from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#4  0x00007fe8f6573622 in gr::tpb_thread_body::tpb_thread_body(boost::shared_ptr<gr::block>, int) ()
    from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#5  0x00007fe8f6565ea1 in boost::detail::function::void_function_obj_invoker0<gr::thread_body_wrapper<gr::tpb_container>, void>::invoke(boost::detail::function::function_buffer&) ()
    from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#6  0x00007fe8f6526610 in boost::detail::thread_data<boost::function0<void> >::run() ()
    from /home/kelly/src/gnuradio/build/gnuradio-runtime/lib/libgnuradio-runtime-3.8git.so.0.0.0
#7  0x00007fe8f9adc94a in ?? ()
    from /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.53.0
#8  0x00007fe8fc8a3f6e in start_thread (arg=0x7fe8e2ffd700)
    at pthread_create.c:311
#9  0x00007fe8fc5ce9cd in clone ()
    at …/sysdeps/unix/sysv/linux/x86_64/clone.S:113
Of course, I had to recompile with debugging info to glean anything
useful from the stack trace. Having done that, I traced the crash to
this line:
c0Val = _mm256_mul_ps(a0Val, b0Val);
I can’t dump the values of a0Val or b0Val, though, because they are
intermediates that the compiler optimizes away in the optimized kernel
code. I tried stepping through the assembly instructions, but I’m not
familiar with the SSE and AVX extensions. Heck, I’m not even familiar
with the x86_64 instruction set, so I have a steep learning curve ahead
of me there. Is it possible to just dump the values in these __m256
variables to a file so I can debug it that way? If that’s not easy to
do, I’m willing to learn what I need to about the instruction set to
debug this thing, but I would sure appreciate some help if anyone has
advice to offer.
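One approach I have not verified on this core yet is to skip the C variables and ask gdb for the vector registers directly, writing everything to a file. The sketch below assumes a gdb build that exposes the AVX registers as $ymm0 to $ymm15 with a v8_float view; the function name dump_crash_state is made up, and ymm0/ymm1 are only a guess, since the registers that really hold a0Val and b0Val have to be read off the faulting instruction that `x/i $pc` prints.

```shell
#!/usr/bin/env bash
# Sketch: have gdb print the crash state from a core file into a text file.
# Assumption: ymm0 and ymm1 are placeholders. Check the instruction shown by
# 'x/i $pc' and change the register names to the ones it actually uses.
dump_crash_state() {
    local exe="$1" core="$2" out="$3"
    gdb -batch \
        -ex 'bt' \
        -ex 'x/i $pc' \
        -ex 'p $ymm0.v8_float' \
        -ex 'p $ymm1.v8_float' \
        "$exe" "$core" > "$out" 2>&1
}
```

For the session above that would be `dump_crash_state /usr/bin/python2.7 core crash_state.txt`. If the faulting instruction takes one operand straight from memory, that operand will not be in a register, and the address it uses is probably the more interesting thing to look at.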
Software version:
I rebased onto the latest next branch last night, at around 1:30 am CDT.
Operating System:
[email protected]:~/src/gnuradio/volk/kernels/volk$ uname -a
Linux octs2 3.11.0-17-generic #31-Ubuntu SMP Mon Feb 3 21:52:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
It’s Ubuntu 13.10
Hardware: ASUS X750J
Intel Quad Core i7 4700HQ 2.4GHz
cpuinfo:
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel® Core™ i7-4700HQ CPU @ 2.40GHz
stepping : 3
microcode : 0x8
cpu MHz : 2401.000
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips : 4789.27
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management: