_______________________________________________ Discuss-gnuradio mailing list Discuss-gnuradio@gnu.org https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
on 2013-02-11 22:20
on 2013-02-12 20:13
On Mon, Feb 11, 2013 at 1:19 PM, Tommy Tracy II <tjt7a@virginia.edu> wrote:

> I tried running volk_profile in Gentoo and got the following:
>
> # volk_profile
> Using Volk machine: sse4_2_32_orc
> RUN_VOLK_TESTS: volk_32fc_s32fc_rotatorpuppet_32fc_a
> *generic completed in 361.04s*
> sse4_1 completed in 0.49s
> RUN_VOLK_TESTS: volk_32fc_32f_multiply_32fc_a
> sse completed in 0.37s
> Segmentation fault

I'm not sure what to make of the first one, but the second is likely caused by trying to execute a SIMD instruction on a CPU that doesn't actually have it. This has happened before when the VOLK detection routines had a bug, or when a VM "lies" about being able to virtualize the SIMD instruction set in the cpuid.

If you could run CMake again, save the output to a file, then grep the two lines below:

-- Available architectures: generic;64;3dnow;abm;popcount;mmx;sse;sse2;orc;norc;sse3;ssse3;sse4_a;sse4_1;sse4_2;avx
-- Available machines: generic_orc;sse2_64_mmx_orc;sse3_64_orc;ssse3_64_orc;sse4_a_64_orc;sse4_1_64_orc;sse4_2_64_orc;avx_64_mmx_orc

...you'll see what VOLK came up with (the above is from the machine I am typing on now.) You can compare this to the capabilities reported by /proc/cpuinfo to see if there is a difference.

Johnathan
on 2013-02-13 17:27
Thank you. I think this may be the problem:
# grep "Available architectures" cmake.out
-- Available architectures:
generic;32;3dnow;abm;popcount;mmx;sse;sse2;orc;norc;sse3;ssse3;sse4_a;sse4_1;sse4_2;avx
# grep "Available machines" cmake.out
-- Available machines:
generic_orc;sse2_32_mmx_orc;sse3_32_orc;ssse3_32_orc;sse4_a_32_orc;sse4_1_32_orc;sse4_2_32_orc;avx_32_mmx_orc
/proc/cpuinfo flags:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm arat dts
tpr_shadow vnmi flexpriority ept void
It looks like my processor does not support AVX, but GNU Radio assumes it
does. Is there a way to disable AVX?
Sincerely,
Tommy James Tracy II
PhD Student
High Performance Low Power Lab
University of Virginia
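
The comparison described above (CMake's "Available architectures" line against the /proc/cpuinfo flags) can be sketched in a few lines of Python. This is only an illustration: the `ARCH_TO_FLAG` table and the function name are assumptions made for this example and are not part of VOLK. The table relies on the Linux flag names `pni` for SSE3, `popcnt` for POPCNT and `sse4a` for SSE4a.

```python
# Assumed mapping from VOLK architecture names to /proc/cpuinfo flag names.
# Entries such as "generic", "32", "orc" and "norc" are not CPU features,
# so they are absent from the table and skipped by the check.
ARCH_TO_FLAG = {
    "3dnow": "3dnow", "abm": "abm", "popcount": "popcnt", "mmx": "mmx",
    "sse": "sse", "sse2": "sse2", "sse3": "pni", "ssse3": "ssse3",
    "sse4_a": "sse4a", "sse4_1": "sse4_1", "sse4_2": "sse4_2", "avx": "avx",
}

def archs_missing_from_cpu(cmake_archs, cpuinfo_flags):
    """Return the architectures CMake listed that the CPU does not report."""
    flags = set(cpuinfo_flags.split())
    return [a for a in cmake_archs.split(";")
            if a in ARCH_TO_FLAG and ARCH_TO_FLAG[a] not in flags]

# Values copied from the cmake.out and /proc/cpuinfo output in this post.
archs = ("generic;32;3dnow;abm;popcount;mmx;sse;sse2;orc;norc;"
         "sse3;ssse3;sse4_a;sse4_1;sse4_2;avx")
flags = """fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm arat dts
tpr_shadow vnmi flexpriority ept void"""

print(archs_missing_from_cpu(archs, flags))
```

For the values above, the check flags 3dnow, abm and sse4_a as well as avx, which suggests the CMake list is not derived from this CPU's capabilities.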
on 2013-02-13 20:46
On Wed, Feb 13, 2013 at 8:25 AM, Tommy Tracy II <tjt7a@virginia.edu> wrote:

> pse36 clflush dts acpi *mmx* fxsr *sse* *sse2* ss ht tm pbe syscall nx
> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
> *ssse3* cx16 xtpr pdcm *sse4_1* *sse4_2* popcnt aes lahf_lm arat dts
> tpr_shadow vnmi flexpriority ept void
>
> It looks like my processor does not support avx, but Gnuradio assumes it
> does. Is there a way to disable avx?

It would be best to find out why libvolk is detecting avx during cmake. Can you post the rest of the lines from your cmake.out related to volk (should be near the beginning)?

Johnathan
on 2013-02-13 20:49
On 02/13/2013 01:44 PM, Johnathan Corgan wrote:

>> does. Is there a way to disable avx?
>
> It would be best to find out why libvolk is detecting avx during cmake.

The build determines, via compiler flags, that the compiler supports AVX, so support for AVX will be built into the library. From the output of volk_profile, it doesn't seem that AVX was detected at run time. So, all seems well so far...

-josh
on 2013-02-13 21:01
The problem is that during the build, AVX support was enabled, even
though my processor doesn't support it.
-- Python checking for Cheetah >= 2.0.0
-- Python checking for Cheetah >= 2.0.0 - found
-- Compiler name: GNU
-- x86* CPU detected
-- CPU width is 32 bits, Overruled arch 64
-- Available architectures:
generic;32;3dnow;abm;popcount;mmx;sse;sse2;orc;norc;sse3;ssse3;sse4_a;sse4_1;sse4_2;avx
-- Available machines:
generic_orc;sse2_32_mmx_orc;sse3_32_orc;ssse3_32_orc;sse4_a_32_orc;sse4_1_32_orc;sse4_2_32_orc;avx_32_mmx_orc
Sincerely,
Tommy James Tracy II
PhD Student
High Performance Low Power Lab
University of Virginia
on 2013-02-13 21:22
On 02/13/2013 01:59 PM, Tommy Tracy II wrote:
> The problem is that during the build, AVX support was enabled, even though my
> processor doesn't support it.

That's intended, because AVX is supported by the compiler. You should
notice, however, that VOLK detected at runtime that AVX was not actually
available on your CPU.
Back to the original issue, I thought there was a segfault in one of the
profile tests. I think a gdb backtrace would be helpful to see which one
is failing.
-josh
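
The distinction above (architectures compiled in because the compiler supports them, versus the machine chosen at run time from what the CPU reports) can be illustrated with a small Python sketch. It is not VOLK's actual selection code: the helper names, the rule that a machine is identified by its leading architecture name, and the assumption that later entries in the list are more capable are all made up for this example.

```python
# Assumed mapping from VOLK architecture names to /proc/cpuinfo flag names.
ARCH_TO_FLAG = {
    "mmx": "mmx", "sse": "sse", "sse2": "sse2", "sse3": "pni",
    "ssse3": "ssse3", "sse4_a": "sse4a", "sse4_1": "sse4_1",
    "sse4_2": "sse4_2", "avx": "avx",
}

def leading_arch(machine):
    """Return the longest known architecture name that prefixes the machine name."""
    candidates = [a for a in ARCH_TO_FLAG if machine.startswith(a + "_")]
    return max(candidates, key=len) if candidates else None

def pick_machine(machines, cpu_flags):
    """Pick the last listed machine whose leading architecture the CPU reports.

    Machines with no recognised leading architecture (e.g. generic_orc)
    are treated as always usable.
    """
    best = None
    for machine in machines.split(";"):
        arch = leading_arch(machine)
        if arch is None or ARCH_TO_FLAG[arch] in cpu_flags:
            best = machine
    return best

# Machines list from the cmake output earlier in the thread; the flag set
# holds only the SIMD-related flags from the posted /proc/cpuinfo.
machines = ("generic_orc;sse2_32_mmx_orc;sse3_32_orc;ssse3_32_orc;"
            "sse4_a_32_orc;sse4_1_32_orc;sse4_2_32_orc;avx_32_mmx_orc")
cpu_flags = {"mmx", "sse", "sse2", "pni", "ssse3", "sse4_1", "sse4_2", "popcnt"}

print(pick_machine(machines, cpu_flags))
```

Under these assumptions the sketch selects sse4_2_32_orc, the same machine volk_profile reported, even though avx_32_mmx_orc was built into the library.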
on 2013-02-13 21:47
What I got from GDB:
Is there any way to get more information using backtrace?
Program received signal SIGSEGV, Segmentation fault.
0xf7f77098 in volk_32fc_32f_multiply_32fc_a_generic () from
/usr/lib/libvolk.so.0.0.0
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
Sincerely,
Tommy James Tracy II
PhD Student
High Performance Low Power Lab
University of Virginia
on 2013-02-14 14:20
Nick, thank you. I was wondering why AVX showed up in the list if the
processor didn't support it.
Does anyone have any ideas why rotatorpuppet would take so long?
Sincerely,
Tommy James Tracy II
PhD Student
High Performance Low Power Lab
University of Virginia
on 2013-02-14 14:52
On 02/13/2013 12:17 PM, Tommy Tracy II wrote:

> Nick, thank you. I was wondering why AVX showed up in the list if the
> processor didn't support it.
> Does anyone have any ideas why rotatorpuppet would take so long?

I don't, but I'm not that worried about it, as the generic implementation is only there as a fallback for when the hardware doesn't support the faster SIMD version. Generic implementation times can vary hugely depending on which version of GCC you're using, what optimization flags were enabled, etc. And sometimes GCC just optimizes really, really terribly.

The segfault is a different story. As Josh suggests, a backtrace would be helpful to see exactly what went wrong.

--n
on 2013-02-14 16:30
On Wed, Feb 13, 2013 at 12:49 PM, Nick Foster <nick@ettus.com> wrote:

> The segfault is a different story. Like Josh suggests a backtrace would be
> helpful to see exactly what went wrong.

The generic implementation of the rotator function normally takes 5-6 seconds on a typical machine, so 361 seconds is quite the outlier, and I'd suspect something besides poor compiler optimization is causing it.

Regarding the segfault: yes, compiling GNU Radio with debug symbols turned on and doing a backtrace under gdb would give us more info. This can be done by re-running CMake with -DCMAKE_BUILD_TYPE=DEBUG and recompiling.

Johnathan
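
A minimal sketch of the steps above, written as a Python helper that only assembles the command lines. The function name, the `..` source directory (an out-of-tree build directory is assumed) and the choice of `bt full` are assumptions for illustration. The gdb options used (`-batch`, `-ex`, `--args`) are standard gdb command-line options.

```python
import shlex

def debug_backtrace_commands(source_dir="..", program="volk_profile"):
    """Build the CMake and gdb command lines for a debug build and backtrace.

    Nothing is executed here; the lists can be passed to subprocess.run()
    or printed and run by hand.
    """
    cmake_cmd = ["cmake", "-DCMAKE_BUILD_TYPE=DEBUG", source_dir]
    # Run the program under gdb non-interactively and print a full
    # backtrace once it stops on the segfault.
    gdb_cmd = ["gdb", "-batch", "-ex", "run", "-ex", "bt full",
               "--args", program]
    return cmake_cmd, gdb_cmd

for cmd in debug_backtrace_commands():
    print(shlex.join(cmd))
```

The library has to be rebuilt and reinstalled between the two commands so that gdb loads the libvolk that carries debug symbols.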