Forum: GNU Radio Beagle board update

Philip Balister (Guest)
on 2008-10-24 06:39
(Received via mailing list)
I have some NEON code in the fff dotproduct routine, the qa code passes:

root@beagleboard:/home/balister/oe/tmp/work/armv7a-angstrom-linux-gnueabi/gnuradio-3.1.3+svnr9809-r4.1/trunk/gnuradio-core/src/tests#
./test_filter
. [generic] [cortex_a8]
. [generic] [cortex_a8]
. [generic]
. [generic]
. [generic]
. [generic]
.>>> gr_fir_fff: using cortex_a8
..
OK (9 tests)


root@beagleboard:/home/balister/oe/tmp/work/armv7a-angstrom-linux-gnueabi/gnuradio-3.1.3+svnr9809-r4.1/trunk/gnuradio-core/src/tests#
./benchmark_dotprod_fff
    generic: taps:  256  input: 4e+07  cpu: 968.586  taps/sec:  1.057e+07
  cortex_a8: taps:  256  input: 4e+07  cpu: 45.703   taps/sec:  2.241e+08

Philip
Eric Blossom (Guest)
on 2008-10-24 15:28
(Received via mailing list)
On Thu, Oct 23, 2008 at 09:38:26PM -0700, Philip Balister wrote:
> .>>> gr_fir_fff: using cortex_a8
> ..
> OK (9 tests)
>
>
> root@beagleboard:/home/balister/oe/tmp/work/armv7a-angstrom-linux-gnueabi/gnuradio-3.1.3+svnr9809-r4.1/trunk/gnuradio-core/src/tests#
> ./benchmark_dotprod_fff
>     generic: taps:  256  input: 4e+07  cpu: 968.586  taps/sec:  1.057e+07
>  cortex_a8: taps:  256  input: 4e+07  cpu: 45.703    taps/sec:  2.241e+08
>
> Philip

Cool!

The good news / bad news is that the spread is worse than on the P4!

Is there a way to get the compiler to use the NEON instruction set in
scalar mode?  E.g., something like -mfpmath=sse on x86?  Maybe -mfp=vfp?
Are you providing the -mcpu=cortex-a8 gcc option?

Eric
Philip Balister (Guest)
on 2008-10-24 17:21
(Received via mailing list)
On Fri, Oct 24, 2008 at 6:27 AM, Eric Blossom <eb@comsec.com> wrote:
>> . [generic]
>> Philip
>
> Cool!
>
> The good news / bad news is that the spread is worse than on the P4!
>
> Is there a way to get the compiler to use the NEON instruction set in
> scalar mode?  E.g., something like -mfpmath=sse on x86?  Maybe -mfp=vfp?
> Are you providing the -mcpu=cortex-a8 gcc option?

The Cortex-A8 numbers use assembler to unroll the inner loop 8 times.
I think this code can get better. I'll have to double check the flags,
but I do not think gcc does a good job generating code for the
vfp/NEON unit. (We are happy gcc can generate anything supporting NEON
and not crash ...)
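
To illustrate what unrolling the inner loop 8 times means, here is a minimal sketch in portable C. It is not the NEON assembler used for the numbers above, and the function names are hypothetical. It only shows the loop structure: eight independent accumulators per pass, followed by a tail loop for lengths that are not a multiple of 8.

```c
#include <stddef.h>

/* Reference version: one multiply-accumulate per loop iteration.
 * Function names in this sketch are hypothetical, not GNU Radio's. */
float dotprod_fff_generic(const float *input, const float *taps, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += input[i] * taps[i];
    return sum;
}

/* 8-way unrolled version. The eight accumulators do not depend on
 * each other, which gives the compiler or a SIMD unit room to
 * overlap the multiply-accumulates. */
float dotprod_fff_unrolled8(const float *input, const float *taps, size_t n)
{
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    float acc4 = 0.0f, acc5 = 0.0f, acc6 = 0.0f, acc7 = 0.0f;
    size_t i = 0;

    /* Main loop: consume eight taps per pass. */
    for (; i + 8 <= n; i += 8) {
        acc0 += input[i + 0] * taps[i + 0];
        acc1 += input[i + 1] * taps[i + 1];
        acc2 += input[i + 2] * taps[i + 2];
        acc3 += input[i + 3] * taps[i + 3];
        acc4 += input[i + 4] * taps[i + 4];
        acc5 += input[i + 5] * taps[i + 5];
        acc6 += input[i + 6] * taps[i + 6];
        acc7 += input[i + 7] * taps[i + 7];
    }

    /* Tail loop: handle the remaining 0..7 taps. */
    float sum = ((acc0 + acc1) + (acc2 + acc3)) +
                ((acc4 + acc5) + (acc6 + acc7));
    for (; i < n; i++)
        sum += input[i] * taps[i];
    return sum;
}
```

Because the unrolled version adds the products in a different order, its result can differ from the generic loop in the last few bits for arbitrary float data.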

Remember, this is clocked at 600 MHz and consumes about 1 Watt.

Philip
Eric Blossom (Guest)
on 2008-10-24 17:32
(Received via mailing list)
On Fri, Oct 24, 2008 at 08:19:49AM -0700, Philip Balister wrote:
> On Fri, Oct 24, 2008 at 6:27 AM, Eric Blossom <eb@comsec.com> wrote:

> >
> Remember, this is clocked at 600 MHz and consumes about 1 Watt.
Understood.  I'm trying to keep you out of the assembly business.  The
fact that your assembly code is 20 times faster is scary.  That's why I
was asking about compiler flags.  I suspect that you're not telling
gcc enough about the machine.
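
For reference, the "20 times" figure can be checked against the benchmark output. The sketch below assumes the cpu column is CPU time in seconds and that taps/sec is taps times input samples divided by that time, which is consistent with the printed figures. The helper names are hypothetical.

```c
/* Multiply-accumulates per second, assuming cpu_seconds is the
 * benchmark's "cpu" column interpreted as seconds (an assumption). */
double taps_per_sec(double ntaps, double ninput, double cpu_seconds)
{
    return ntaps * ninput / cpu_seconds;
}

/* Ratio of baseline CPU time to optimized CPU time. */
double speedup(double baseline_cpu, double optimized_cpu)
{
    return baseline_cpu / optimized_cpu;
}
```

With the posted numbers, speedup(968.586, 45.703) comes out at roughly 21, and taps_per_sec(256, 4e7, 45.703) is about 2.24e+08, matching the cortex_a8 line.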

Eric