Small cleanups for inband code

Hi,

attached are some small cleanups for the inband code. Some are just
cosmetic
changes, some give some small performance improvement (dont disassemble
a
list and assemble the same list again, just pass it on …)

Regards,

Stefan

Stefan Brüns wrote:

Hi,

attached are some small cleanups for the inband code. Some are just cosmetic
changes, some give some small performance improvement (dont disassemble a
list and assemble the same list again, just pass it on …)

Thanks Stefan!

I will look through the patch and double check everything. If all looks
good I will apply it in a developer branch and push it to the trunk.

  • George

Discuss-gnuradio mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio

Awesome, thanks again Stefan! I should be able to get to this by
Tuesday.

  • George

One more: This time, some changes for the uniform vectors (uv) of pmt.

Rationale: std::vector<> always initializes memory.
First, for the uvs, uninitialized may be fine for many cases, otherwise,
initialization should be requested explicitly.
Second, even for scalar types, initialization is expensive, as memset is
only
used for byte sized types, otherwise a for loop is used (even when
setting to
0).

This patch introduces a small helper class which substitutes std::vector
for
the uniform vectors.

This patch has no influence on ABI/API.

It saves about 20% of cycles for test_usrp_inband_tx. (Savings for *_rx
are
much smaller, as it only uses uv_u8, not uv_s16, but still significant)

Stefan


Discuss-gnuradio mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio

There are some brackets missing. Please apply onto inband_cleanup1.diff

Stefan

Stefan Bruens wrote:

the uniform vectors.

This patch has no influence on ABI/API.

It saves about 20% of cycles for test_usrp_inband_tx. (Savings for *_rx are
much smaller, as it only uses uv_u8, not uv_s16, but still significant)

Awesome, thanks as usual. Getting around to this… one thing though, I
think you forgot to include the patch :slight_smile:

  • George

On Fri, Jan 30, 2009 at 06:35:29PM +0100, Stefan Brüns wrote:

Here is another series of patches. These come in 3 flavours:

Stefan, thanks for these!

I’ll get this applied either today or tomorrow, and will coordinate
with George on the earlier one.

Eric

Here is another series of patches. These come in 3 flavours:

Independent, general cleanups and additions, which neither influence API
nor
ABI
c1255: Minor optimization, assume well formed list as common case
c1256: Avoid some double type casting
c1261: Add some unit test for pmt_subsetp and pmt_memq

Changes in ABI, changing no functionality
c1257: Make function arguments const& (incomplete)
c1260: Make function arguments const& (continued)

Intrusive changes (pun intended)
c1258: Switch from shared_ptr to intrusive_ptr for pmt_t

Some more comments:
Patches in the first group should need no further discussion, as these
are
just some optimizations which are completely invisible (despite speedup)
from
the outside.
The second group should be unproblematic as well. Although a recompile
of
anything dependent on pmt is needed, behaviour is unchanged.

The third one may need some more comments:
currently (trunk) pmt_t is defined as boost::shared_ptr<pmt_base>.
Although
the refcounting is quite nice, it comes at a quite high cost. Every
shared
pointer is combined from 3 objects, one being the “data” (in this case,
pmt_base), one the object for the refcounting, and the third one the
shared_ptr itself, holding pointers to two aforementioned objects. The
last
one may be created on the stack, but the other two are allocated on the
heap.

This costs are cut by half if using intrusive_ptr’s. The refcount is
stored in
the object itself. Some more speed is gained by the fact that some
operations
are just simpler for intrusive_ptr than for shared_ptr (have a look at
eg
operator= …). Although the gain is high, there are no code changes
needed,
just a simple recompile

All patches have been tested by running the mblock and pmt qa tests.


Benchmarks:
P4 HT @ 2.80GHz
test_usrp_inband_tx with a constant value for the signal generator:

Results of running “time ./test_usrp_inband_tx” (trunk)
real 0m20.670s
user 0m7.348s
sys 0m2.328s

Results of running “time ./test_usrp_inband_tx” (all patches applied)
real 0m20.574s
user 0m3.632s
sys 0m1.992s

Regards,

Stefan

PS: Not finished yet. Next is switching pmt_list* from pairs to vectors
and
mb_message to intrusive_ptr as well. Also nice to have: lockless
implementation of the message queue …