Re: The GSoC project on LDPC codes

musicdenotation · November 8, 2013, 12:03am

Now I’m wondering why I didn’t get any segmentation faults.

You should be able to reproduce this by loading the alist file from the
reference site,
http://www.inference.phy.cam.ac.uk/mackay/codes/alist.html
(that example may not be a good LDPC example, but it should at minimum
cause a fault in the unfixed alist.cc).

The next question (tl;dr) about LDPC may not be an actual bug, but I
don’t have enough knowledge of this topic to know for sure. Everything I
know about error-correcting codes has been learned through
experimentation and trial and error…

One of the error-correcting codes used in P25 has a 4x8 generator
matrix, and I’ve been able to get numpy to generate codewords that match
what we’ve seen over the air from analysis of actual P25 TDMA traffic.
When I tried to generate the set of all possible codewords using LDPC,
it produced a very interesting result (see below). It appears that the
codewords that gr-ldpc generates really are the same as the ones used in
P25; however, the generated parity bits appear before the user data
bits, whereas in P25 the parity bits are appended after the original
data bits.

data   P25        gr-ldpc
word   codeword   codeword
====   ========   ========
   0   00000000   00000000
   1   00010111   11010001
   2   00101110   01110010
   3   00111001   10100011
   4   01001011   10110100
   5   01011100   01100101
   6   01100101   11000110
   7   01110010   00010111
   8   10001101   11101000
   9   10011010   00111001
  10   10100011   10011010
  11   10110100   01001011
  12   11000110   01011100
  13   11010001   10001101
  14   11101000   00101110
  15   11111111   11111111

The “problem” is that if we receive, say, P25 codeword 01011100, we want
it to decode to 5 (0101), whereas if gr-ldpc is called upon to decode
01011100 its answer is 12. In this small example we could clean up
afterwards by adding a lookup table (4 bits in, 4 bits out) to map the
result back to the proper value, but it’s not clear that this would
generalize.
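
A minimal sketch of the 4-bit lookup table mentioned above. It assumes, as the table suggests, that gr-ldpc reports the trailing four bits of a codeword as the data word; the names `g` and `lookup` are illustrative and not part of gr-ldpc.

```python
import numpy as np

# P25 generator matrix from this post, in systematic form [I | P].
g = np.array([[1, 0, 0, 0, 1, 1, 0, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1, 0],
              [0, 0, 0, 1, 0, 1, 1, 1]])

lookup = {}
for data_word in range(16):
    # Data bits, most significant bit first.
    bits = [(data_word >> shift) & 1 for shift in (3, 2, 1, 0)]
    codeword = np.dot(bits, g) % 2
    # Assumption: gr-ldpc returns the trailing 4 bits of the codeword.
    grldpc_value = int("".join(str(b) for b in codeword[4:]), 2)
    lookup[grldpc_value] = data_word

print(lookup[12])  # 5
```

The table is one-to-one here only because every codeword in this code has a distinct trailing nibble; a different code or bit layout would need its own check.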

I’ve pasted below the Python code used to generate the second column
(the P25 codewords) in the table above. The example shown is for
dataword=‘0101’ (5).

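import numpy as np

# 4x8 generator matrix, one row per data bit
g = np.array([[1, 0, 0, 0, 1, 1, 0, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1, 0],
              [0, 0, 0, 1, 0, 1, 1, 1]])

# encode data word 0101 (5): vector-matrix product modulo 2
codeword = np.dot([0, 1, 0, 1], g) % 2

print(codeword)
[0 1 0 1 1 1 0 0]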
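
Extending the snippet above to all 16 data words gives a quick way to check the claim that both columns of the table contain the same set of codewords. This is only a sketch: the gr-ldpc column is copied by hand from the table in this post, not produced by calling gr-ldpc, and the variable names are illustrative.

```python
import numpy as np

g = np.array([[1, 0, 0, 0, 1, 1, 0, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1, 0],
              [0, 0, 0, 1, 0, 1, 1, 1]])

# All 16 four-bit data words, most significant bit first.
data_words = np.array([[(d >> s) & 1 for s in (3, 2, 1, 0)]
                       for d in range(16)])

# Encode every data word at once, modulo 2.
p25_codewords = data_words.dot(g) % 2
p25_strings = ["".join(str(b) for b in row) for row in p25_codewords]

# gr-ldpc column, copied from the table in this post.
grldpc_strings = [
    "00000000", "11010001", "01110010", "10100011",
    "10110100", "01100101", "11000110", "00010111",
    "11101000", "00111001", "10011010", "01001011",
    "01011100", "10001101", "00101110", "11111111",
]

# Same codewords, different assignment of data words to them.
same_set = set(p25_strings) == set(grldpc_strings)
print(same_set)
```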

The question (if I understand things correctly) seems to be: how
reasonable or unreasonable is it for users of the library to be picky
about the exact ordering of the parity bits in the generated codewords?

Thx again Manu

Best

Max

ikjtel · November 8, 2013, 7:43am

Hi,

The difference could be because of the way the encoding is implemented.
My encoding scheme is very naive, and it is not necessarily true that
the last K bits are the data bits. It happened here, but nothing in the
encoder guarantees this in general. If the decoder is able to correct
all the errors in the received vector, the decoding will return the
data. So even if the codewords do not match one-to-one with P25, there
are no worries.
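
One way to see why the ordering does not affect error detection is to run a parity check on the code itself. The sketch below derives a parity-check matrix H = [P^T | I] from the systematic generator matrix G = [I | P] posted earlier in the thread and shows that a valid codeword has a zero syndrome while a corrupted one does not. This is not gr-ldpc's decoder; `h` and `syndrome` are illustrative names.

```python
import numpy as np

g = np.array([[1, 0, 0, 0, 1, 1, 0, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1, 0],
              [0, 0, 0, 1, 0, 1, 1, 1]])

p = g[:, 4:]                                 # parity part of G = [I | P]
h = np.hstack([p.T, np.eye(4, dtype=int)])   # H = [P^T | I]

def syndrome(word):
    """Return H * word modulo 2; all zeros means 'word' is a codeword."""
    return h.dot(word) % 2

valid = np.array([0, 1, 0, 1, 1, 1, 0, 0])   # 01011100 from the table
corrupted = valid.copy()
corrupted[2] ^= 1                            # flip a single bit

print(syndrome(valid))       # all zeros
print(syndrome(corrupted))   # not all zeros
```

The same check passes for every entry in either column of the table, since both columns list the same set of codewords.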

One good thing about having the data (as it is) at the end or beginning
is that once we decode the codeword correctly, it is trivial to get back
the data: it is as complex as splitting a vector in two. gr-ldpc
encoding does not have this feature, but it knows which positions
correspond to data bits, and in what order, so recovering the data is as
complex as permuting a vector. That is still linear, so it is not going
to be a bottleneck.
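
The extraction step described above can be sketched as index selection on the decoded codeword. The function `extract_data` and the `data_positions` argument are hypothetical and not part of the gr-ldpc API; the positions used here are read off the table earlier in the thread.

```python
import numpy as np

def extract_data(codeword, data_positions):
    """Pick the data bits out of a decoded codeword, in the given order."""
    return np.asarray(codeword)[list(data_positions)]

# P25 layout: data bits first, so this is just a split.
p25_data = extract_data([0, 1, 0, 1, 1, 1, 0, 0], [0, 1, 2, 3])

# gr-ldpc layout in this example: data bits happen to be the last four.
grldpc_data = extract_data([0, 1, 1, 0, 0, 1, 0, 1], [4, 5, 6, 7])

print(p25_data, grldpc_data)
```

Both calls recover the data word 0101, and the cost is one pass over the selected positions whether or not they are contiguous.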

So I don’t see any reason why the user has to be picky about the exact
ordering of the parity bits in the generated codeword, unless they use
some other decoding scheme to decode the codeword.