I’m working on programming the “Lazy” Viterbi (convolutional, maximum likelihood) decoder right now. See

<http://citeseer.ist.psu.edu/599573.html>
<http://vanu.com/resources/publications/2002fast_maximum_likelihood_decoder.pdf>

I have a few issues which I’m trying to resolve, and maybe someone in GR-land knows the answer(s) or knows someone …

Their basic idea is to maintain two structures: a “trellis” for state-transition decisions which have already been made, and a “priority queue” (PQ) for decisions which are pending. The “Lazy” comes from the fact that PQ entries might already have been rendered moot by trellis decisions; thus the constraint is on trellis membership, while (almost) arbitrary (“lazy”) PQ membership is allowed. The general loop is to “pop” the minimum-metric PQ entry and, if it is not yet in the trellis, put it there and add all of its next-time transitions to the PQ; if it is already in the trellis, then some other path reached that state with a metric at least as good, and the entry is simply dropped. This is repeated until a transition reaches the end of the trellis (for block decoding) or a given number of bits (for stream decoding). The math and logic here are sound.
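To make that loop concrete, here is a minimal sketch of the pop / commit / expand cycle over a toy two-state trellis. All names (Entry, lazy_decode, the branch-metric table) are mine, not from the paper, and I’ve used std::priority_queue in place of their slotted PQ just to show the control flow:

```cpp
#include <queue>
#include <vector>

struct Entry {
    int metric;  // accumulated path metric
    int time;    // trellis column (bit index)
    int state;   // encoder state
    int prev;    // previous state, kept for traceback
};

struct ByMetric {  // min-heap ordering on metric
    bool operator()(const Entry& a, const Entry& b) const {
        return a.metric > b.metric;
    }
};

// branch[t][s][s2] = branch metric from state s at time t to state s2
// at time t+1 (toy stand-in for the real encoder transitions).
// Returns the winning state sequence (length n+1), starting in state 0.
std::vector<int> lazy_decode(
    const std::vector<std::vector<std::vector<int>>>& branch) {
    const int n = (int)branch.size(), S = 2;
    // trellis[t][s] = previous state once (t,s) is decided, else -1
    std::vector<std::vector<int>> trellis(n + 1, std::vector<int>(S, -1));
    std::priority_queue<Entry, std::vector<Entry>, ByMetric> pq;
    pq.push({0, 0, 0, 0});
    int end_state = -1;
    while (!pq.empty()) {
        Entry e = pq.top();
        pq.pop();
        if (trellis[e.time][e.state] != -1)
            continue;                        // already decided: drop entry
        trellis[e.time][e.state] = e.prev;   // commit to the trellis
        if (e.time == n) { end_state = e.state; break; }  // done (block mode)
        for (int s2 = 0; s2 < S; ++s2)       // expand next-time transitions
            pq.push({e.metric + branch[e.time][e.state][s2],
                     e.time + 1, s2, e.state});
    }
    std::vector<int> path(n + 1);            // traceback via prev pointers
    path[n] = end_state;
    for (int t = n; t > 0; --t) path[t - 1] = trellis[t][path[t]];
    return path;
}
```

Each (time, state) pair is committed at most once, and the first entry to reach time n carries the global minimum metric, so the traceback gives the maximum-likelihood path.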

The trellis can be done in many ways (e.g. full or sparse matrix, hash table), but the simplest (which I will use up front) is just a full “matrix”, a std::vector<std::vector<void*> > of pointers to the previous state, in order to provide the traceback. This structure is simple to work with and understand; there are no issues here. All of the issues are with the PQ structure.

The PQ holds the possible transitions, ordered by minimum metric, where the absolute minimum (combined) metric for each set of time transitions is always 0. For full-precision metrics, the PQ (I think) has to be a (linked) list, with new possible transitions inserted via a search for the correct insertion point … not constant time. In order to maintain (expected) constant time in the PQ, they use integer (M-bit precision) metrics and limit the number of slots available for PQ storage (to (2^M)+1), then store each possible transition in the slot which matches its accumulated (possibly relative) metric.
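For concreteness, here is how I picture that slot array (my naming, and a sketch only, so the conflict is visible): with M-bit metrics every pending entry lies within 2^M of the current minimum, so (2^M)+1 slots indexed circularly by metric mod ((2^M)+1) suffice:

```cpp
#include <cstddef>
#include <vector>

struct Transition {
    int time, state, prev, metric;  // a pending state transition
};

// Slotted PQ as I read the paper: one slot per possible (relative)
// metric value, indexed circularly. Payloads are pointers so that an
// empty slot is just nullptr.
struct SlottedPQ {
    std::vector<Transition*> slot;
    explicit SlottedPQ(int M) : slot((std::size_t(1) << M) + 1, nullptr) {}

    void push(Transition* t) {
        // NB: if the slot is already occupied this silently overwrites
        // the old entry -- exactly the unresolved conflict discussed
        // next.
        slot[t->metric % slot.size()] = t;
    }
};
```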

The primary issue which they don’t address is: what happens when a PQ storage slot is already occupied but another metric is computed which needs that same slot (overwrite? but which one?)? There is no mechanism described to handle this conflict, and neither of the metrics can necessarily be dropped. (I think) it’s trivial to come up with an example where this happens and neither metric could yet be dropped, so clearly they had to resolve this issue. But it’s not in the text of the report, so I’m trying, a priori, to think of a (reasonably simple) way around it.

The first way I’ve thought of is to make each PQ slot the head of a linked list. But this requires a possibly large number of linked nodes to be available a priori, or calling ‘new’ a lot. An upper bound on “large” is n*(2^m)*(2^#o), where ‘n’ is the block length or stream decode length (in bits), ‘m’ is the total number of delays (== constraint length - 1), and ‘#o’ is the number of output bit-streams from the encoder; for (n,m,#o) = (100,8,2) this is 100k nodes (at a likely 8 bytes / node == 800 kB), which is significantly more memory than required by the rest of the decoder (~100 kB) - so this doesn’t seem like the “best” way to go.
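Sketching that idea anyway (again my names; the pool size and payload type are placeholders): each slot heads an intrusive singly linked list, with nodes drawn from a preallocated pool so ‘new’ is never called inside the decoding loop. Colliding metrics then coexist in one slot, in arbitrary order, which is harmless since they tie:

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int payload;  // would hold the pending transition in a real decoder
    Node* next;   // next entry landing in the same metric slot
};

struct ChainedPQ {
    std::vector<Node*> slot;  // head of the chain for each metric slot
    std::vector<Node> pool;   // preallocated nodes; no 'new' in the loop
    std::size_t used = 0;

    ChainedPQ(int M, std::size_t pool_size)
        : slot((std::size_t(1) << M) + 1, nullptr), pool(pool_size) {}

    void push(int metric, int payload) {
        Node* n = &pool[used++];  // assumes pool_size was large enough
        n->payload = payload;
        std::size_t i = metric % slot.size();
        n->next = slot[i];        // prepend: nothing is ever overwritten
        slot[i] = n;
    }
};
```

Recycling nodes through a free list as entries are popped would keep the live-node count near the PQ’s actual occupancy rather than the n*(2^m)*(2^#o) worst case, which might defuse the 800 kB concern; I haven’t sized that yet.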

Another issue which they do not address is how the minimum-metric PQ slot is determined. The obvious way is to start at the “head” of the PQ and work through slots until the first occupied one is found … this is less “expensive” than adding new metrics in sorted order, but how much so (maybe a factor of sqrt(M) on average?)? And this search has to be done each time a new minimum metric is to be extracted … not a large “cost” under high SNR, but it would likely be a significant “cost” under low SNR.
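One mitigation I’m considering (my sketch, not from the paper): since accumulated metrics never decrease, the find-min scan can resume from wherever the previous pop stopped instead of restarting at the head, so each slot is visited only O(1) times per wrap of the metric window rather than per extraction:

```cpp
#include <cstddef>
#include <vector>

// Circular slot array with a rolling "head" index for find-min.
// Payloads are plain ints here; -1 marks an empty slot.
struct RollingMinPQ {
    std::vector<int> slot;
    std::size_t head = 0;  // scan resumes here on the next pop

    explicit RollingMinPQ(std::size_t nslots) : slot(nslots, -1) {}

    void push(int metric, int payload) {
        slot[metric % slot.size()] = payload;  // (collision handling omitted)
    }

    int pop_min() {                            // assumes the PQ is nonempty
        while (slot[head] == -1)               // skip empty slots, resuming
            head = (head + 1) % slot.size();   // from the last stop point
        int p = slot[head];
        slot[head] = -1;
        return p;
    }
};
```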

Thanks in advance for your thoughts. - MLD