Forum: GNU Radio Issues with running volk_profile in GENTOO

Posted by Tommy Tracy II (Guest)
on 2013-02-11 22:20
Attachment: signature.asc (494 Bytes)
(Received via mailing list)
_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
Posted by Johnathan Corgan (Guest)
on 2013-02-12 20:13
(Received via mailing list)
On Mon, Feb 11, 2013 at 1:19 PM, Tommy Tracy II <tjt7a@virginia.edu> wrote:


> I tried running volk_profile in Gentoo and got the following:
>
> # volk_profile
> Using Volk machine: sse4_2_32_orc
> RUN_VOLK_TESTS: volk_32fc_s32fc_rotatorpuppet_32fc_a
> *generic completed in 361.04s*
> sse4_1 completed in 0.49s
>
> RUN_VOLK_TESTS: volk_32fc_32f_multiply_32fc_a
> sse completed in 0.37s
> Segmentation fault
>

I'm not sure what to make of the first one, but the second is likely
caused by trying to execute a SIMD instruction on a CPU that doesn't
actually have it.  This has happened before when the VOLK detection
routines had a bug, or when a VM "lies" about being able to virtualize
the SIMD instruction set in the cpuid.

If you could run CMake again, save the output to a file, then grep the
two lines below:

-- Available architectures:
generic;64;3dnow;abm;popcount;mmx;sse;sse2;orc;norc;sse3;ssse3;sse4_a;sse4_1;sse4_2;avx
-- Available machines:
generic_orc;sse2_64_mmx_orc;sse3_64_orc;ssse3_64_orc;sse4_a_64_orc;sse4_1_64_orc;sse4_2_64_orc;avx_64_mmx_orc

...you'll see what VOLK came up with (the above is from the machine I am
typing on now.)  You can compare this to the capabilities reported by
/proc/cpuinfo to see if there is a difference.
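
A minimal sketch of that comparison in Python, for illustration only. The
helper names, the name mapping (for example, SSE3 showing up as "pni" in
/proc/cpuinfo), and the list of entries treated as non-CPU features are
assumptions, not something VOLK itself provides:

```python
# Hypothetical helpers: compare VOLK's CMake architecture list with the
# flags reported in /proc/cpuinfo. Nothing here is part of VOLK.

# Assumed mapping from VOLK architecture names to /proc/cpuinfo flag names
# where the two differ; names not listed are assumed to match as-is.
VOLK_TO_CPUINFO = {"sse3": "pni", "popcount": "popcnt", "sse4_a": "sse4a"}

# Entries in the VOLK list assumed not to correspond to a CPU flag.
NOT_CPU_FLAGS = {"generic", "32", "64", "orc", "norc"}


def parse_architectures(cmake_output):
    """Return the names listed after '-- Available architectures:'."""
    lines = cmake_output.splitlines()
    for i, line in enumerate(lines):
        if "Available architectures:" in line:
            rest = line.split(":", 1)[1].strip()
            # The list may have been wrapped onto the following line.
            if not rest and i + 1 < len(lines):
                rest = lines[i + 1].strip()
            return [name for name in rest.split(";") if name]
    return []


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def missing_on_cpu(cmake_output, cpuinfo_text):
    """Architectures in the CMake list that the CPU does not report."""
    flags = parse_cpu_flags(cpuinfo_text)
    return [
        arch
        for arch in parse_architectures(cmake_output)
        if arch not in NOT_CPU_FLAGS
        and VOLK_TO_CPUINFO.get(arch, arch) not in flags
    ]


# Small inline example; real input would be the saved CMake output and
# the contents of /proc/cpuinfo.
sample_cmake = "-- Available architectures: generic;32;mmx;sse;sse3;avx\n"
sample_cpuinfo = "flags : fpu mmx sse pni\n"
print(missing_on_cpu(sample_cmake, sample_cpuinfo))
```

With real data, the two strings would be read from the saved CMake output
file and from /proc/cpuinfo. Any name the function returns is an
architecture the library was built for but the CPU does not advertise.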

Johnathan
Posted by Tommy Tracy II (Guest)
on 2013-02-13 17:27
(Received via mailing list)
Thank you. I think this may be the problem:

# grep "Available architectures" cmake.out
-- Available architectures: 
generic;32;3dnow;abm;popcount;mmx;sse;sse2;orc;norc;sse3;ssse3;sse4_a;sse4_1;sse4_2;avx

# grep "Available machines" cmake.out
-- Available machines: 
generic_orc;sse2_32_mmx_orc;sse3_32_orc;ssse3_32_orc;sse4_a_32_orc;sse4_1_32_orc;sse4_2_32_orc;avx_32_mmx_orc

/proc/cpuinfo flags:
flags    : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est 
tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm arat dts 
tpr_shadow vnmi flexpriority ept vpid


It looks like my processor does not support avx, but GNU Radio assumes
it does. Is there a way to disable avx?

                   Sincerely,
          Tommy James Tracy II
            PhD Student
High Performance Low Power Lab
           University of Virginia
Posted by Johnathan Corgan (Guest)
on 2013-02-13 20:46
(Received via mailing list)
On Wed, Feb 13, 2013 at 8:25 AM, Tommy Tracy II <tjt7a@virginia.edu> wrote:


> pse36 clflush dts acpi *mmx* fxsr *sse* *sse2* ss ht tm pbe syscall nx
> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
> *ssse3 *cx16 xtpr pdcm *sse4_1* *sse4_2* popcnt aes lahf_lm arat dts
> tpr_shadow vnmi flexpriority ept vpid
>
>
> It looks like my processor does not support avx, but GNU Radio assumes it
> does. Is there a way to disable avx?
>

It would be best to find out why libvolk is detecting avx during cmake.
Can you post the rest of the lines from your cmake.out related to volk
(should be near the beginning)?

Johnathan
Posted by Josh Blum (Guest)
on 2013-02-13 20:49
(Received via mailing list)
On 02/13/2013 01:44 PM, Johnathan Corgan wrote:
>>
>> does. Is there a way to disable avx?
>>
>
> It would be best to find out why libvolk is detecting avx during cmake.

It determines that AVX is supported by the compiler via flags. So,
support for AVX will be built into the library.

From the output of volk_profile, it doesn't seem that AVX was
detected. So, all seems well so far...
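
A rough sketch of that split between build time and run time, in Python.
This is not VOLK's actual code: the machine names echo the CMake output in
this thread, while the feature sets and the select_machine helper are
simplified assumptions:

```python
# Illustration only: machines are compiled in at build time, but which one
# is used is decided at run time from what the CPU actually supports.

# Machines built into the library, in increasing order of capability, each
# with the CPU features it is assumed to need (simplified).
BUILT_MACHINES = [
    ("generic_orc", set()),
    ("sse2_32_mmx_orc", {"mmx", "sse", "sse2"}),
    ("sse4_2_32_orc", {"sse", "sse2", "sse3", "ssse3", "sse4_1", "sse4_2"}),
    ("avx_32_mmx_orc",
     {"mmx", "sse", "sse2", "sse3", "ssse3", "sse4_1", "sse4_2", "avx"}),
]


def select_machine(cpu_features):
    """Pick the most capable built machine the CPU fully supports."""
    chosen = BUILT_MACHINES[0][0]
    for name, required in BUILT_MACHINES:
        # A machine qualifies only if every feature it needs is present.
        if required <= cpu_features:
            chosen = name
    return chosen


# A CPU like the one in this thread: SSE4.2 present, AVX absent.
cpu = {"mmx", "sse", "sse2", "sse3", "ssse3", "sse4_1", "sse4_2", "popcount"}
print(select_machine(cpu))  # prints "sse4_2_32_orc"
```

In this sketch the AVX machine exists in the build but is never selected
on a CPU without AVX, which matches the "Using Volk machine:
sse4_2_32_orc" line in the profile output.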

-josh
Posted by Tommy Tracy II (Guest)
on 2013-02-13 21:01
(Received via mailing list)
The problem is that during the build, AVX support was enabled, even 
though my processor doesn't support it.

-- Python checking for Cheetah >= 2.0.0
-- Python checking for Cheetah >= 2.0.0 - found
-- Compiler name: GNU
-- x86* CPU detected
-- CPU width is 32 bits, Overruled arch 64
-- Available architectures: 
generic;32;3dnow;abm;popcount;mmx;sse;sse2;orc;norc;sse3;ssse3;sse4_a;sse4_1;sse4_2;avx
-- Available machines: 
generic_orc;sse2_32_mmx_orc;sse3_32_orc;ssse3_32_orc;sse4_a_32_orc;sse4_1_32_orc;sse4_2_32_orc;avx_32_mmx_orc

                   Sincerely,
          Tommy James Tracy II
            PhD Student
High Performance Low Power Lab
           University of Virginia
Posted by Josh Blum (Guest)
on 2013-02-13 21:22
(Received via mailing list)
On 02/13/2013 01:59 PM, Tommy Tracy II wrote:
> The problem is that during the build, AVX support was enabled, even though my
> processor doesn't support it.

That's intended, because AVX is supported by the compiler. You should
notice, however, that VOLK detected at runtime that AVX was not actually
available on your CPU.

Back to the original issue, I thought there was a segfault in one of the
profile tests. I think a gdb backtrace would be helpful to see which one
is failing.

-josh
Posted by Tommy Tracy II (Guest)
on 2013-02-13 21:47
(Received via mailing list)
What I got from GDB:
Is there any way to get more information using backtrace?

Program received signal SIGSEGV, Segmentation fault.
0xf7f77098 in volk_32fc_32f_multiply_32fc_a_generic () from 
/usr/lib/libvolk.so.0.0.0



Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.

                   Sincerely,
          Tommy James Tracy II
            PhD Student
High Performance Low Power Lab
           University of Virginia
Posted by Tommy Tracy II (Guest)
on 2013-02-14 14:20
(Received via mailing list)
Nick, thank you. I was wondering why AVX showed up in the list if the 
processor didn't support it.
Does anyone have any ideas why rotatorpuppet would take so long?

                   Sincerely,
          Tommy James Tracy II
            PhD Student
High Performance Low Power Lab
           University of Virginia
Posted by Nick Foster (Guest)
on 2013-02-14 14:52
(Received via mailing list)
On 02/13/2013 12:17 PM, Tommy Tracy II wrote:
> Nick, thank you. I was wondering why AVX showed up in the list if the
> processor didn't support it.
> Does anyone have any ideas why rotatorpuppet would take so long?

I don't, but I'm not that worried about it, as the generic implementation
is only there as a backup for when the hardware doesn't support the more
efficient SIMD version. Generic implementation times can vary hugely
depending on which version of GCC you're using, what optimization flags
were enabled, etc. And sometimes GCC just optimizes really, really
terribly.

The segfault is a different story. As Josh suggests, a backtrace would
be helpful to see exactly what went wrong.

--n
Posted by Johnathan Corgan (Guest)
on 2013-02-14 16:30
(Received via mailing list)
On Wed, Feb 13, 2013 at 12:49 PM, Nick Foster <nick@ettus.com> wrote:


> The segfault is a different story. Like Josh suggests a backtrace would be
> helpful to see exactly what went wrong.
>

The generic implementation of the rotator function normally takes 5-6
seconds on a typical machine, so the 361 seconds is quite the outlier,
and I'd suspect there is something else going on to cause that besides
poor compiler optimization.

Regarding the segfault, yes, compiling GNU Radio with debug symbols
turned on and doing a backtrace under gdb would provide us more info.

This can be done by re-running CMake with -DCMAKE_BUILD_TYPE=DEBUG and
recompiling.
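
A hedged sketch of those steps as a small Python wrapper. The function
names, the use of make as the build tool, and the directory arguments are
assumptions; the gdb options shown (-batch, -ex, --args) are standard gdb
command-line options:

```python
import subprocess


def debug_build_commands(source_dir):
    """Commands to reconfigure with debug symbols and rebuild.

    Intended to be run from the build directory; 'make' as the build
    tool is an assumption about the local setup.
    """
    return [
        ["cmake", "-DCMAKE_BUILD_TYPE=DEBUG", source_dir],
        ["make"],
    ]


def gdb_backtrace_command(program="volk_profile"):
    """gdb in batch mode: run the program, then print a backtrace."""
    return ["gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", program]


def run_all(source_dir, build_dir):
    """Rebuild with debug symbols, then capture a backtrace."""
    for cmd in debug_build_commands(source_dir):
        subprocess.run(cmd, cwd=build_dir, check=True)
    # No check=True here: the interest is in gdb's printed backtrace,
    # not in its exit status.
    subprocess.run(gdb_backtrace_command(), cwd=build_dir)
```

Calling run_all("..", ".") from a build directory would run the three
commands in order. The rebuilt library also has to be the one that
volk_profile actually loads (for example, after reinstalling it) for the
backtrace to show symbol names.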

 Johnathan