Hello group!
I tried oprofile with usrp_wfm_rcv_nogui.py [1] and got interesting
results.
= Preparations =
First start oprofile as root (required) with
# oprof_start
Disable the kernel image (only needed for kernel profiling, which we are
not doing), enable “per application profiles”, leave the remaining
options disabled, and start the profiler.
In a second shell (as root again), prepare a call to opcontrol to get an
exact snapshot of my sample run:
# opcontrol --session=some_session_name
This will save all previous data to session “some_session_name”. To be
exact, the generic session “current” will be saved to that name, all
counters reset and a new session “current” will be created.
In a third shell (screen is your friend) prepare to start the FM
receiver:
$ ./usrp_wfm_rcv_nogui.py -f89.2
Note: At this point we have a running profiler that dumps data about the
whole system to the “current” session, a prepared command to create a
new session, and a command line for a program ready to run.
= Profiling =
Profiling is easy:
- Start the program
- Start a new session: # opcontrol --session=some_session_name
- Listen and/or watch GNURadio do its work for a reasonable time, say 1
  minute [2]
- Save all your recorded data in a new session: # opcontrol --session=gnuradio_wfm_nogui_1
- Stop the program (^C)
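The steps above can be sketched as a small Python driver. This is only a sketch under assumptions: the function name profile_run and its parameters are made up for illustration, the two opcontrol invocations are copied from this post as written, and a real (non dry-run) call has to be made as root with the profiler already started.

```python
import signal
import subprocess
import time


def profile_run(program_cmd, session, seconds=60, dry_run=True):
    """Walk through the profiling steps in order.

    Returns the commands in the order they are issued. With dry_run=True
    (the default) nothing is executed, so the sequence can be inspected first.
    """
    # Both opcontrol invocations are copied from the post, not verified here.
    reset_cmd = ["opcontrol", "--session=some_session_name"]
    save_cmd = ["opcontrol", "--session=" + session]
    issued = [program_cmd, reset_cmd, save_cmd]
    if dry_run:
        return issued

    proc = subprocess.Popen(program_cmd)      # start the program
    try:
        subprocess.check_call(reset_cmd)      # start a fresh "current" session
        time.sleep(seconds)                   # let the receiver do its work
        subprocess.check_call(save_cmd)       # save the recorded data
    finally:
        proc.send_signal(signal.SIGINT)       # stop the program, like ^C
        proc.wait()
    return issued


if __name__ == "__main__":
    steps = profile_run(["./usrp_wfm_rcv_nogui.py", "-f89.2"],
                        "gnuradio_wfm_nogui_1")
    for cmd in steps:
        print(" ".join(cmd))
```

The dry-run default is deliberate: it lets you check the command sequence before anything touches the running profiler.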
= Analysis =
== opreport ==
opreport gives us info about the data we collected. Without a session
argument the “current” session is used; other sessions can be examined
by appending “session:my_session_name”. To get statistics for a certain
binary, just add the path to that binary. For GNURadio the binary is
python, which calls GNURadio as a library. The option -r reverses the
sorting, printing the most active parts last. The option -l lists all
symbols with the corresponding data (performance counters and percentage).
$ opreport -rl session:gnuradio_wfm_nogui_1 /usr/bin/python
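If you want to post-process the symbol list, a minimal Python sketch could look like this. It assumes each data line has the four columns seen in the results of this post (sample count, percentage, image name, symbol name); SymbolRow and parse_opreport_line are names invented for the example, and other opreport options may produce a different column layout.

```python
from typing import NamedTuple, Optional


class SymbolRow(NamedTuple):
    samples: int
    percent: float
    image: str
    symbol: str


def parse_opreport_line(line: str) -> Optional[SymbolRow]:
    """Parse one 'samples percent image symbol' line, or return None."""
    # Split into at most four fields so that symbols containing spaces,
    # such as "gr_fast_atan2f(float, float)", stay in one piece.
    parts = line.split(None, 3)
    if len(parts) != 4:
        return None
    try:
        return SymbolRow(int(parts[0]), float(parts[1]), parts[2], parts[3].strip())
    except ValueError:
        # Header or banner lines do not start with two numbers.
        return None


if __name__ == "__main__":
    example = "5720 18.3563 libgnuradio-core.so.0.0.0 gr_fast_atan2f(float, float)"
    print(parse_opreport_line(example))
```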
== opannotate ==
Another really useful tool is opannotate. It locates the corresponding
source code via symbol tables and annotates it with runtime information.
It can produce annotated assembler code or, when the binaries are not
stripped, annotated C/C++ code. Quite nice. Just add -o
my_output_directory to tell it where to store the annotated code (no, it
won't mess with the originals :-)), and of course the binary: we want our
data to be relative to the program that was running, not the whole
system:
$ mkdir annotated
$ opannotate -D smart session:gnuradio_wfm_nogui_1 --source -o annotated /usr/bin/python
Voilà! Have a look at the files in the annotated directory. You’ll find
something like that from
gr-build/gnuradio-core/src/lib/filter/float_dotprod_sse.S :
:.loop2:
414 1.3286 : mulps (%edx), %xmm0 /* .loop2 total: 7982 25.6154 */
1054 3.3824 : addps %xmm2, %xmm6
71 0.2278 : movaps 0x20(%eax), %xmm2
:
587 1.8838 : mulps 0x10(%edx), %xmm1
1318 4.2296 : addps %xmm3, %xmm7
159 0.5103 : movaps 0x30(%eax), %xmm3
:
334 1.0719 : mulps 0x20(%edx), %xmm2
1261 4.0467 : addps %xmm0, %xmm4
262 0.8408 : movaps 0x40(%eax), %xmm0
:
203 0.6515 : mulps 0x30(%edx), %xmm3
and so on.
You see, .loop2 in float_dotprod_sse.S has quite a lot to do.
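To see where inside the loop the samples land, the annotated lines can be tallied per instruction. The following Python sketch does that for the excerpt from this post only; the regular expression assumes the "samples percent : instruction" layout seen in that excerpt, and samples_per_mnemonic is a name made up for the example.

```python
import re
from collections import defaultdict

# Excerpt of the annotated .loop2 from this post (the listing is partial).
LISTING = """\
:.loop2:
414 1.3286 : mulps (%edx), %xmm0 /* .loop2 total: 7982 25.6154 */
1054 3.3824 : addps %xmm2, %xmm6
71 0.2278 : movaps 0x20(%eax), %xmm2
:
587 1.8838 : mulps 0x10(%edx), %xmm1
1318 4.2296 : addps %xmm3, %xmm7
159 0.5103 : movaps 0x30(%eax), %xmm3
:
334 1.0719 : mulps 0x20(%edx), %xmm2
1261 4.0467 : addps %xmm0, %xmm4
262 0.8408 : movaps 0x40(%eax), %xmm0
:
203 0.6515 : mulps 0x30(%edx), %xmm3
"""

# sample count, percentage, a colon, then the instruction mnemonic
ANNOTATED = re.compile(r"^\s*(\d+)\s+([\d.]+)\s*:\s*(\S+)")


def samples_per_mnemonic(listing):
    """Sum the sample counts of all annotated lines, grouped by mnemonic."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        match = ANNOTATED.match(line)
        if match:  # labels and bare ":" lines carry no samples
            totals[match.group(3)] += int(match.group(1))
    return dict(totals)


if __name__ == "__main__":
    for mnemonic, count in sorted(samples_per_mnemonic(LISTING).items()):
        print(mnemonic, count)
```

Keep in mind that a sampling profiler may attribute a sample to an instruction near the one that was actually stalling, so per-instruction counts are a hint rather than a verdict.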
= Interpreting data =
oprofile is a statistical profiler, so the gathered data is not exact.
You probably don't want to rely too much on the fractional digits of the
numbers given. But it gives a good overview.
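As a rough feel for how much to trust a number, here is a small Python sketch. It rests on an assumption that is mine and not something oprofile reports: that sample counts behave like independent random counts (Poisson-like), so the relative uncertainty of a count n is about 1/sqrt(n). The function name relative_error is made up for the example.

```python
import math


def relative_error(samples):
    """Rough relative uncertainty of a sample count, assuming Poisson-like counts."""
    if samples <= 0:
        raise ValueError("need a positive sample count")
    return 1.0 / math.sqrt(samples)


if __name__ == "__main__":
    # Two counts taken from this post: one movaps line and the .loop2 total.
    for count in (71, 7982):
        print("%5d samples: roughly +/- %.1f%%" % (count, 100.0 * relative_error(count)))
```

Under that assumption a line with a few dozen samples is uncertain at the ten percent level, while a symbol with thousands of samples is good to about one percent, which is why a longer run helps.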
= Results from ./usrp_wfm_rcv_nogui.py =
I let ./usrp_wfm_rcv_nogui.py run for one minute, and the most
interesting symbols were:
1929 6.1904 libgnuradio-core.so.0.0.0 .loop1
2503 8.0325 libgnuradio-core.so.0.0.0 gr_fir_ccf_simd::filter(std::complex const*)
5656 18.1509 libgnuradio-core.so.0.0.0 .loop2
5720 18.3563 libgnuradio-core.so.0.0.0 gr_fast_atan2f(float, float)
7982 25.6154 libgnuradio-core.so.0.0.0 .loop2
Together these add up to some 76 percent of the run time.
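The arithmetic behind that figure, as a short Python sketch. The numbers are the five rows listed above; the estimate of the total number of samples is my own back-calculation from one row and its percentage, not something opreport printed here.

```python
# (samples, percent, symbol) for the five rows listed in this post
ROWS = [
    (1929, 6.1904, ".loop1"),
    (2503, 8.0325, "gr_fir_ccf_simd::filter"),
    (5656, 18.1509, ".loop2 (first)"),
    (5720, 18.3563, "gr_fast_atan2f"),
    (7982, 25.6154, ".loop2 (second)"),
]

total_samples = sum(samples for samples, _, _ in ROWS)
total_percent = sum(percent for _, percent, _ in ROWS)

# Back-calculate the overall sample count from the largest row:
# samples / (percent / 100) gives the number of samples that make up 100%.
largest = max(ROWS)
estimated_all_samples = largest[0] / (largest[1] / 100.0)

if __name__ == "__main__":
    print("samples in the top five symbols:", total_samples)
    print("share of run time: %.2f%%" % total_percent)
    print("estimated samples overall: about %d" % round(estimated_all_samples))
```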
- .loop1 is in gnuradio-core/src/lib/filter/fcomplex_dotprod_sse.S
- gr_fir_ccf_simd::filter(std::complex const*) is in
  gnuradio-core/src/lib/filter/gr_fir_ccf_simd.cc
- first .loop2 (18.1509) is in
  gnuradio-core/src/lib/filter/fcomplex_dotprod_sse.S
- gr_fast_atan2f(float, float) is in
  gnuradio-core/src/lib/general/gr_fast_atan2f.cc
- second .loop2 (25.6154) is in
  gnuradio-core/src/lib/filter/float_dotprod_sse.S, as the annotated
  listing above shows
Patrick
[1] the FM receiver that runs on my PIII 450 without hiccups…
[2] oprofile is a statistical profiling system: it takes a look at the
system from time to time, which yields statistical data. A long
measurement period is preferable to get reasonably good results.
Engineers motto: cheap, good, fast: choose any two
Patrick S.
Student of Telematik, Techn. University Graz, Austria