Proposal Draft: Next Generation Digital Voice Codecs and Vocoders for Amateur Radio

Hi Folks,

I’ve written a draft proposal at http://codec2.org/
to create some good open voice codecs and solve the problem of
proprietary AMBE on D*STAR.
Please read it, and send comments directly to me - [email protected]
I’ll be at Hamvention this weekend to discuss this, too.

Thanks

Bruce P. K6BP

On Wed, May 14, 2008 at 8:32 PM, Bruce P. [email protected] wrote:

Hi Folks,

I’ve written a draft proposal at http://codec2.org/
to create some good open voice codecs and solve the problem of proprietary
AMBE on D*STAR.
Please read it, and send comments directly to me - [email protected]
I’ll be at Hamvention this weekend to discuss this, too.

Hey Bruce… Some points to add to your page:
Low-density parity-check (LDPC) codes are unpatented (by virtue of being
invented, then forgotten :) ). They have performance on par with turbo
codes, and may someday be proven to be mathematically equivalent.
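
To make "parity check" concrete, here is a minimal sketch of the test every such code is built on: a received word is a valid codeword only if every row of the parity-check matrix sums to zero (mod 2) over it. The matrix H below is a toy assumption (it is the small Hamming(7,4) matrix, far too small and dense to count as a real LDPC code), and the names `syndrome`, `codeword` and `corrupted` are made up for illustration. Real LDPC codes use very large, sparse matrices and an iterative decoder, neither of which is shown here.

```python
# Toy parity-check matrix (assumption: Hamming(7,4), used only for illustration).
# Each row is one parity equation over the 7 received bits.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(parity_matrix, word):
    """Return one bit per parity equation: 0 if satisfied, 1 if violated."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in parity_matrix]


codeword = [1, 1, 1, 0, 0, 0, 0]   # satisfies all three equations
corrupted = codeword.copy()
corrupted[4] ^= 1                  # flip one bit to simulate a channel error

print(syndrome(H, codeword))       # all zeros: no error detected
print(syndrome(H, corrupted))      # nonzero: at least one equation is violated
```

A nonzero syndrome tells the receiver that something went wrong; an LDPC decoder goes further and uses the pattern of violated equations to work out which bits to fix.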

The open codec world is alive, though not very large. Make sure you
talk to Jean-Marc Valin [email protected] (author of
Speex).

We’ve been working on a new low-latency codec for speech and music,
CELT, and are about to publish a paper on it (celt-codec.org).
While CELT wouldn’t be useful for narrowband voice, some of the
components of CELT would be very useful in an AMBE killer
(particularly CWRS, the algebraic vector quantizer). CELT itself
could be used for a 10-20 kHz wide mode (40-60 kbit/s with decent SNR)
to replace FM with something of much greater quality, if there was
interest… But a lower bandwidth mode will be far more interesting.
:)
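
As a rough illustration of what an algebraic vector quantizer buys you, the sketch below counts the codebook for a pulse-style quantizer: all integer vectors of dimension N whose absolute values sum to K. That codebook definition is an assumption about the general technique, and the function names are hypothetical; this is not CELT's actual code. The point is that the codebook never has to be stored, because its size (and hence the number of bits per index) follows from a simple recurrence.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def pulse_codebook_size(n, k):
    """Count integer vectors of length n whose absolute values sum to k."""
    if k == 0:
        return 1          # only the all-zero vector
    if n == 0:
        return 0          # no dimensions left to place the remaining pulses
    # Standard recurrence for signed pulse vectors.
    return (pulse_codebook_size(n - 1, k)
            + pulse_codebook_size(n, k - 1)
            + pulse_codebook_size(n - 1, k - 1))


def index_bits(n, k):
    """Bits needed to give every codeword a distinct index."""
    return (pulse_codebook_size(n, k) - 1).bit_length()


print(pulse_codebook_size(2, 2))   # 8: (+-2,0), (0,+-2), (+-1,+-1)
print(pulse_codebook_size(3, 2))   # 18
print(index_bits(3, 2))            # 5
```

Because the codebook is defined by a rule rather than a trained table, the same encoder can be reused at different bit rates by changing only N and K.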

Gregory M. wrote:

We’ve been working on a new low-latency codec for speech and music,
CELT, and are about to publish a paper on it (celt-codec.org).
While CELT wouldn’t be useful for narrowband voice, some of the
components of CELT would be very useful in an AMBE killer
(particularly CWRS, the algebraic vector quantizer). CELT itself
could be used for a 10-20 kHz wide mode (40-60 kbit/s with decent SNR)
to replace FM with something of much greater quality, if there was
interest… But a lower bandwidth mode will be far more interesting.

Would this be useful to solve the garble problem when there’s a
firefighter with a loud chainsaw in the background? That just kills the
codecs they use on APCO 25 right now.

Thanks

Bruce

Bruce-

Would this be useful to solve the garble problem when there’s a
firefighter with a loud chainsaw in the background? That just kills the
codecs they use on APCO 25 right now.

If the codec can handle music and other audio signals besides speech,
then the answer is probably yes. That would mean the codec uses
“perceptual” techniques, which are general, rather than specific
techniques based on a human vocal tract model. The vocal tract model is
what gets codecs like AMBE and MELPe into trouble when they encounter
non-speech sounds. MP3 uses perceptual techniques.

But like Gregory said, lower bandwidth operation is key. Without that,
codecs like AMBE will persist. MELPe goes as low as 600 bps. Essentially,
such codecs accomplish this by making a wide range of assumptions based
on speech-like signals.
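
A quick back-of-the-envelope calculation shows why such low rates force those assumptions. The sketch below works out how many bits are available per frame at the rates mentioned in this thread. The 22.5 ms frame length is an assumption (it is typical of MELP-family coders, and is applied to every rate here only to make the comparison easy), and `bits_per_frame` is a made-up helper name.

```python
def bits_per_frame(bitrate_bps, frame_us):
    """Bits available to describe one frame of audio."""
    return bitrate_bps * frame_us / 1_000_000


FRAME_US = 22_500  # assumed frame length: 22.5 ms, in microseconds

# Rates taken from the thread: 600 bps (MELPe) and 40-60 kbit/s (wide CELT mode).
for label, rate in [("600 bps", 600), ("40 kbit/s", 40_000), ("60 kbit/s", 60_000)]:
    print(f"{label:>10}: {bits_per_frame(rate, FRAME_US):7.1f} bits per frame")
```

With only about a dozen bits per frame there is no room to describe an arbitrary waveform, so the coder can send little more than a few parameters of a speech model. With hundreds of bits per frame, a general perceptual approach becomes practical.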

-Jeff