Two questions re VOLK armv7

1: To get VOLK to compile correctly under gnuradio, I need to include
-march=armv7 in CFLAGS (volk/lib/CMakeLists.txt), which I do with a
toolchain file, with VOLK cloned from git outside the gnuradio tree.

    [email protected]:/mnt/volk# rm CMakeCache.txt; cmake \
        -DCMAKE_USER_MAKE_RULES_OVERRIDE=cmake/Toolchains/armv7.cmake .

What is the best way to ensure -march=armv7 gets passed to cmake when
building gnuradio? Should I specify CFLAGS on the command line or in
the environment?

GCC says:

[email protected]:/mnt/volk# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/4.9/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: …/src/configure -v --with-pkgversion='Debian 4.9.2-10'
--with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs
--enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.9 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib
--enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--enable-gnu-unique-object --disable-libitm --disable-libquadmath
--enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-armhf/jre
--enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-armhf
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-armhf
--with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --disable-sjlj-exceptions
--with-arch=armv7-a --with-fpu=vfpv3-d16 --with-float=hard
--with-mode=thumb --enable-checking=release --build=arm-linux-gnueabihf
--host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 4.9.2 (Debian 4.9.2-10)

2: Three QA tests fail:

The following kernels failed QA:

volk_32fc_magnitude_32f
volk_32f_sqrt_32f
volk_32fc_s32fc_multiply_32fc

How do I go about fixing that? I included test output below.

[email protected]:/mnt/volk# ./lib/test_all
Using Volk machine: neon_hardfp
RUN_VOLK_TESTS: volk_64u_popcntpuppet_64u(131071,1)
generic completed in 0.993ms
neon completed in 1.402ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16u_byteswappuppet_16u(131071,1)
generic completed in 0.478ms
neon completed in 0.266ms
neon_table completed in 0.271ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32u_byteswappuppet_32u(131071,1)
generic completed in 0.762ms
neon completed in 0.659ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32u_popcntpuppet_32u(131071,1)
generic completed in 0.661ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_64u_byteswappuppet_64u(131071,1)
generic completed in 1.621ms
neon completed in 1.136ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32fc_s32fc_rotatorpuppet_32fc(131071,1)
generic completed in 11.884ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8u_conv_k7_r2puppet_8u(131071,0)
generic completed in 0.002ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_fm_detectpuppet_32f(131071,1)
generic completed in 2.371ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16ic_s32f_deinterleave_real_32f(131071,1)
generic completed in 0.741ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16ic_deinterleave_real_8i(131071,1)
generic completed in 0.509ms
neon completed in 0.404ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_16ic_deinterleave_16i_x2(131071,1)
generic completed in 0.56ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16ic_s32f_deinterleave_32f_x2(131071,1)
generic completed in 1.199ms
neon completed in 1.11ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_16ic_deinterleave_real_16i(131071,1)
generic completed in 0.34ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16ic_magnitude_16i(131071,1)
generic completed in 2.14ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16ic_s32f_magnitude_32f(131071,1)
generic completed in 1.413ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16i_s32f_convert_32f(131071,1)
generic completed in 0.32ms
neon completed in 0.267ms
a_generic completed in 0.25ms
Best aligned arch: a_generic
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_16i_convert_8i(131071,1)
generic completed in 0.218ms
neon completed in 0.23ms
a_generic completed in 0.21ms
Best aligned arch: a_generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_16i_32fc_dot_prod_32fc(131071,1)
generic completed in 0.916ms
neon completed in 0.773ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32f_accumulator_s32f(131071,1)
generic completed in 0.292ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_add_32f(131071,1)
generic completed in 0.652ms
u_neon completed in 0.607ms
a_generic completed in 0.601ms
Best aligned arch: a_generic
Best unaligned arch: u_neon
RUN_VOLK_TESTS: volk_32fc_32f_multiply_32fc(131071,1)
generic completed in 1.002ms
neon completed in 0.926ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32f_log2_32f(131071,1)
generic completed in 7.741ms
neon completed in 0.888ms
u_generic completed in 7.684ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32f_expfast_32f(131071,1)
generic completed in 27.81ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_pow_32f(131071,1)
generic completed in 33.018ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_sin_32f(131071,1)
generic completed in 4.454ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_cos_32f(131071,1)
generic completed in 5.466ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_tan_32f(131071,1)
generic completed in 8.962ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_atan_32f(131071,1)
generic completed in 7.27ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_asin_32f(131071,1)
u_generic completed in 12.883ms
Best aligned arch: u_generic
Best unaligned arch: u_generic
RUN_VOLK_TESTS: volk_32f_acos_32f(131071,1)
generic completed in 13.695ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_s32f_power_32fc(131071,1)
generic completed in 163.532ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_calc_spectral_noise_floor_32f(131071,1)
generic completed in 1.123ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_s32f_atan2_32f(131071,1)
generic completed in 14.414ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_x2_conjugate_dot_prod_32fc(131071,1)
generic completed in 1.273ms
neon completed in 1.033ms
a_generic completed in 1.175ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32fc_deinterleave_32f_x2(131071,1)
neon completed in 1.234ms
generic completed in 1.372ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32fc_deinterleave_64f_x2(131071,1)
generic completed in 1.932ms
a_generic completed in 2.05ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_s32f_deinterleave_real_16i(131071,1)
generic completed in 0.653ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_deinterleave_imag_32f(131071,1)
neon completed in 0.787ms
generic completed in 0.664ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_deinterleave_real_32f(131071,1)
generic completed in 0.809ms
neon completed in 0.669ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32fc_deinterleave_real_64f(131071,1)
generic completed in 0.918ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_x2_dot_prod_32fc(131071,1)
generic completed in 1.251ms
a_generic completed in 1.258ms
neon completed in 1.102ms
neon_opttests completed in 1.176ms
neon_optfma completed in 0.995ms
neon_optfmaunroll completed in 1.212ms
Best aligned arch: neon_optfma
Best unaligned arch: neon_optfma
RUN_VOLK_TESTS: volk_32fc_32f_dot_prod_32fc(131071,1)
generic completed in 0.934ms
neon_unroll completed in 0.862ms
a_neon completed in 0.868ms
a_neonasm completed in 0.837ms
a_neonpipeline completed in 0.687ms
Best aligned arch: a_neonpipeline
Best unaligned arch: neon_unroll
RUN_VOLK_TESTS: volk_32fc_index_max_16u(131071,1)
generic completed in 0.875ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_s32f_magnitude_16i(131071,1)
generic completed in 1.588ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_magnitude_32f(131071,1)
generic completed in 1.335ms
a_generic completed in 1.295ms
neon completed in 0.673ms
neon_fancy_sweet completed in 0.924ms
offset 0 in1: 0.705791 in2: 0.707031
offset 1 in1: 1.21055 in2: 1.20703
offset 2 in1: 0.453742 in2: 0.453125
offset 3 in1: 0.970181 in2: 0.96875
offset 4 in1: 1.05658 in2: 1.05469
offset 5 in1: 0.821474 in2: 0.818359
offset 6 in1: 0.862105 in2: 0.861328
offset 7 in1: 0.897716 in2: 0.896484
offset 8 in1: 0.796805 in2: 0.796875
offset 9 in1: 0.889502 in2: 0.886719
volk_32fc_magnitude_32f: fail on arch neon
offset 0 in1: 0.705791 in2: 0.710017
offset 1 in1: 1.21055 in2: 1.2123
offset 2 in1: 0.453742 in2: 0.451831
offset 3 in1: 0.970181 in2: 0.967088
offset 4 in1: 1.05658 in2: 1.04841
offset 5 in1: 0.821474 in2: 0.826919
offset 6 in1: 0.862105 in2: 0.868165
offset 7 in1: 0.897716 in2: 0.898518
offset 8 in1: 0.796805 in2: 0.804625
offset 9 in1: 0.889502 in2: 0.897857
volk_32fc_magnitude_32f: fail on arch neon_fancy_sweet
Best aligned arch: neon
Best unaligned arch: neon
Failure on volk_32fc_magnitude_32f
RUN_VOLK_TESTS: volk_32fc_magnitude_squared_32f(131071,1)
generic completed in 0.672ms
neon completed in 0.641ms
a_generic completed in 0.669ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32fc_x2_multiply_32fc(131071,1)
generic completed in 1.774ms
a_generic completed in 1.738ms
neon completed in 1.218ms
neon_opttests completed in 1.189ms
neonasm completed in 1.173ms
Best aligned arch: neonasm
Best unaligned arch: neonasm
RUN_VOLK_TESTS: volk_32fc_x2_multiply_conjugate_32fc(131071,1)
generic completed in 1.82ms
neon completed in 1.253ms
a_generic completed in 1.794ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32fc_conjugate_32fc(131071,1)
generic completed in 0.736ms
a_neon completed in 0.749ms
a_generic completed in 0.653ms
Best aligned arch: a_generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_convert_16i(131071,1)
generic completed in 3.872ms
a_generic completed in 3.926ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_convert_32i(131071,1)
generic completed in 0.782ms
a_generic completed in 0.782ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_convert_64f(131071,1)
generic completed in 0.645ms
a_generic completed in 0.477ms
Best aligned arch: a_generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_convert_8i(131071,1)
generic completed in 2.2ms
a_generic completed in 2.111ms
Best aligned arch: a_generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_s32f_power_spectrum_32f(131071,1)
generic completed in 13.009ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_x2_square_dist_32f(131071,1)
neon completed in 0.738ms
generic completed in 0.686ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_x2_s32f_square_dist_scalar_mult_32f(131071,1)
generic completed in 0.724ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_divide_32f(131071,1)
generic completed in 1.434ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_dot_prod_32f(131071,1)
generic completed in 0.592ms
a_generic completed in 0.575ms
neonopts completed in 0.553ms
neon completed in 0.574ms
neonasm completed in 0.433ms
neonasm_opts completed in 0.507ms
Best aligned arch: neonasm
Best unaligned arch: neonasm
RUN_VOLK_TESTS: volk_32f_x2_s32f_interleave_16ic(131071,1)
generic completed in 0.7ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_interleave_32fc(131071,1)
neon completed in 0.767ms
generic completed in 0.748ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_max_32f(131071,1)
neon completed in 0.631ms
generic completed in 0.596ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_min_32f(131071,1)
neon completed in 0.721ms
generic completed in 0.625ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_multiply_32f(131071,1)
generic completed in 0.737ms
neon completed in 0.625ms
a_generic completed in 0.611ms
Best aligned arch: a_generic
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32f_s32f_normalize(131071,1)
generic completed in 0.632ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_power_32f(131071,1)
generic completed in 35.731ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_sqrt_32f(131071,1)
neon completed in 0.534ms
generic completed in 4.204ms
offset 0 in1: 0.68207 in2: 0.681641
offset 2 in1: 0.892081 in2: 0.890625
offset 3 in1: 0.61199 in2: 0.609375
offset 6 in1: 0.406933 in2: 0.405273
offset 7 in1: 0.771893 in2: 0.769531
offset 9 in1: 0.839927 in2: 0.837891
offset 10 in1: 0.833134 in2: 0.832031
offset 11 in1: 0.844083 in2: 0.84375
offset 13 in1: 0.990661 in2: 0.990234
offset 14 in1: 0.879374 in2: 0.878906
volk_32f_sqrt_32f: fail on arch neon
Best aligned arch: neon
Best unaligned arch: neon
Failure on volk_32f_sqrt_32f
RUN_VOLK_TESTS: volk_32f_s32f_stddev_32f(131071,1)
generic completed in 0.419ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_stddev_and_mean_32f_x2(131071,1)
generic completed in 0.441ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_subtract_32f(131071,1)
generic completed in 0.645ms
neon completed in 0.613ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32f_x3_sum_of_poly_32f(131071,1)
generic completed in 1.544ms
a_neon completed in 1.059ms
neonvert completed in 0.54ms
Best aligned arch: neonvert
Best unaligned arch: neonvert
RUN_VOLK_TESTS: volk_32i_x2_and_32i(131071,1)
neon completed in 0.62ms
generic completed in 0.588ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32i_s32f_convert_32f(131071,1)
generic completed in 0.432ms
a_generic completed in 0.445ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32i_x2_or_32i(131071,1)
neon completed in 0.728ms
generic completed in 0.7ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32f_x2_dot_prod_16i(131071,1)
generic completed in 0.54ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_64f_convert_32f(131071,1)
generic completed in 0.829ms
a_generic completed in 0.695ms
Best aligned arch: a_generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_64f_x2_max_64f(131071,1)
generic completed in 1.368ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_64f_x2_min_64f(131071,1)
generic completed in 1.371ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8ic_deinterleave_16i_x2(131071,1)
generic completed in 0.419ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8ic_s32f_deinterleave_32f_x2(131071,1)
generic completed in 0.686ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8ic_deinterleave_real_16i(131071,1)
generic completed in 0.31ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8ic_s32f_deinterleave_real_32f(131071,1)
generic completed in 0.409ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8ic_deinterleave_real_8i(131071,1)
generic completed in 0.173ms
neon completed in 0.157ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_8ic_x2_multiply_conjugate_16ic(131071,1)
generic completed in 2.468ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8ic_x2_s32f_multiply_conjugate_32fc(131071,1)
generic completed in 2.21ms
Best aligned arch: generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_8i_convert_16i(131071,1)
generic completed in 0.137ms
a_generic completed in 0.112ms
neon completed in 0.115ms
Best aligned arch: a_generic
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_8i_s32f_convert_32f(131071,1)
generic completed in 0.187ms
a_generic completed in 0.179ms
Best aligned arch: a_generic
Best unaligned arch: generic
RUN_VOLK_TESTS: volk_32fc_s32fc_multiply_32fc(131071,1)
generic completed in 1.461ms
neon completed in 0.747ms
a_generic completed in 1.447ms
offset 0 in1: 182.179 + 36.3883j in2: 146.351 + -160.164j
offset 1 in1: -60.14 + -277.288j in2: -195.427 + -81.677j
offset 2 in1: 268.287 + 258.704j in2: 205.207 + 82.4777j
offset 3 in1: 254.856 + 307.423j in2: 0 + 0j
offset 5 in1: -277.526 + -318.87j in2: -596.396 + -596.396j
offset 6 in1: -5.68741 + 178.404j in2: -1.73927e-08 + 5.45578e-07j
offset 7 in1: -279.86 + 44.2968j in2: 0 + 0j
offset 9 in1: -262.193 + -123.933j in2: -386.127 + -386.127j
offset 10 in1: 57.8619 + -289.394j in2: 1.76948e-07 + -8.84997e-07j
offset 11 in1: -23.4022 + 307.941j in2: 0 + 0j
volk_32fc_s32fc_multiply_32fc: fail on arch neon
Best aligned arch: neon
Best unaligned arch: neon
Failure on volk_32fc_s32fc_multiply_32fc
RUN_VOLK_TESTS: volk_32f_s32f_multiply_32f(131071,1)
generic completed in 0.444ms
u_neon completed in 0.361ms
a_generic completed in 0.349ms
Best aligned arch: a_generic
Best unaligned arch: u_neon
RUN_VOLK_TESTS: volk_32f_binary_slicer_32i(131071,1)
generic completed in 0.538ms
generic_branchless completed in 0.532ms
Best aligned arch: generic_branchless
Best unaligned arch: generic_branchless
RUN_VOLK_TESTS: volk_32f_binary_slicer_8i(131071,1)
generic completed in 0.531ms
generic_branchless completed in 0.519ms
neon completed in 0.324ms
Best aligned arch: neon
Best unaligned arch: neon
RUN_VOLK_TESTS: volk_32f_tanh_32f(131071,1)
generic completed in 12.998ms
series completed in 1.83ms
Best aligned arch: series
Best unaligned arch: series
Kernel QA finished: 3 failures out of 91 tests.
The following kernels failed QA:
volk_32fc_magnitude_32f
volk_32f_sqrt_32f
volk_32fc_s32fc_multiply_32fc
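
To judge how far off the failing results are, the "offset" lines in the report can be turned into relative errors. The following is a minimal Python sketch, not part of VOLK. It assumes that in1 is the reference (generic) result and in2 is the result from the arch under test; the names LOG, PATTERN and relative_errors are made up for this illustration. The sample lines are copied from the volk_32f_sqrt_32f failure above.

```python
import re

# First five "offset" lines of the volk_32f_sqrt_32f failure report.
LOG = """\
offset 0 in1: 0.68207 in2: 0.681641
offset 2 in1: 0.892081 in2: 0.890625
offset 3 in1: 0.61199 in2: 0.609375
offset 6 in1: 0.406933 in2: 0.405273
offset 7 in1: 0.771893 in2: 0.769531
"""

# Matches real-valued report lines only (not the complex "a + bj" form).
PATTERN = re.compile(r"offset (\d+) in1: (\S+) in2: (\S+)")


def relative_errors(log_text):
    """Return a list of (offset, relative error) pairs.

    Assumption: in1 is the reference value, in2 the value under test.
    """
    results = []
    for match in PATTERN.finditer(log_text):
        offset, reference, candidate = match.groups()
        ref = float(reference)
        cand = float(candidate)
        results.append((int(offset), abs(cand - ref) / abs(ref)))
    return results


if __name__ == "__main__":
    for offset, err in relative_errors(LOG):
        print(f"offset {offset}: relative error {err:.2e}")
```

For these five lines every relative error is below half a percent, so the failures look like a precision issue rather than a wrong result.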

Hi Dennis,

The 32fc_s32fc_multiply_32fc should be fixed as of about 12 hours ago.
Try updating your repo.

The other two have nothing ‘wrong’ with them. We use a QA tolerance to
check results, and it just happens that the tolerance is a bit too tight
for ARM. Those operations are hard to get both accurate and fast because
they involve a square root. NEON has an inverse square root estimate,
which we use because it’s the fastest way to get a result. Unfortunately,
whenever we use that estimate the accuracy drops a bit. In these cases we
actually care about the square root itself, so we then apply the NEON
inverse (reciprocal) estimate to that result, and it is not particularly
accurate either. The two together make any square-root-like operation in
VOLK not particularly accurate on NEON. On the plus side, it’s probably
accurate enough for radio.

Nathan
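
The effect described in this reply can be illustrated without ARM hardware. The following Python sketch is not VOLK code and does not use NEON. It assumes, purely for illustration, that each hardware estimate is good to about 8 significant bits, and it emulates that with a hypothetical quantize() helper. It then compares a square root built from two chained estimates against one where the inverse square root estimate gets a single Newton-Raphson refinement step. That refinement is a standard numerical technique shown here for comparison, not something the thread says VOLK does.

```python
import math


def quantize(value, bits=8):
    """Round a positive float to `bits` significant binary digits.

    Stand-in for a coarse hardware estimate (the 8 bits are an assumption).
    """
    mantissa, exponent = math.frexp(value)  # value == mantissa * 2**exponent
    scale = 1 << bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)


def rsqrt_estimate(x):
    """Emulated coarse estimate of 1/sqrt(x)."""
    return quantize(1.0 / math.sqrt(x))


def recip_estimate(y):
    """Emulated coarse estimate of 1/y."""
    return quantize(1.0 / y)


def sqrt_two_estimates(x):
    """sqrt(x) as the reciprocal estimate of the inverse sqrt estimate."""
    return recip_estimate(rsqrt_estimate(x))


def sqrt_refined(x):
    """sqrt(x) from one Newton-Raphson step on the inverse sqrt estimate."""
    y = rsqrt_estimate(x)
    y = y * (3.0 - x * y * y) / 2.0  # refine y towards 1/sqrt(x)
    return x * y                     # x * (1/sqrt(x)) == sqrt(x)


def max_rel_error(fn, samples):
    """Largest relative error of fn against math.sqrt over the samples."""
    return max(abs(fn(x) - math.sqrt(x)) / math.sqrt(x) for x in samples)


SAMPLES = [0.05 + 0.01 * i for i in range(200)]

if __name__ == "__main__":
    print("two estimates:", max_rel_error(sqrt_two_estimates, SAMPLES))
    print("one NR step:  ", max_rel_error(sqrt_refined, SAMPLES))
```

Under these assumptions the two chained estimates each contribute their own rounding error, while the refined version is considerably closer to math.sqrt at the cost of a few extra multiplies per sample.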

On Sat, 2015-07-04 at 16:32 -0400, West, Nathan wrote:

Hi Dennis,

The 32fc_s32fc_multiply_32fc should be fixed as of about 12 hours ago.
Try updating your repo.

Thanks. I did a pull and recompiled. Works better. The test now says:


Kernel QA finished: 2 failures out of 91 tests.
The following kernels failed QA:
volk_32fc_magnitude_32f
volk_32f_sqrt_32f

This is my ToolChain file, BTW:

[email protected]:/mnt/volk# cat cmake/Toolchains/armv7.cmake
########################################################################
# Toolchain file for building native on a ARM Cortex A8 w/ NEON
# Usage: cmake -DCMAKE_TOOLCHAIN_FILE=
########################################################################
# Dennis G., July 2015

SET(CMAKE_CXX_FLAGS_INIT "-march=armv7 -mfpu=neon -mfloat-abi=hard")
SET(CMAKE_C_FLAGS_INIT "-march=armv7 -mfpu=neon -mfloat-abi=hard")

I am native compiling on a CubieBoard4:

[email protected]:/mnt/volk# uname -a
Linux cubieboard4 3.4.39 #18 SMP PREEMPT Tue Apr 21 15:01:22 CST 2015
armv7l GNU/Linux
