Here are my 2c regarding this issue:
I think the line code should be a separate block,
not related to modulation etc.
In fact this is precisely a block or a convolutional code,
with the only exception that it is used not for error correction but to
shape the data. Or you can think about it as an error-correcting
code against a channel that is hostile to long runlengths.
If you are interested only in breaking runlengths then
the 8b/10b may not be the best choice (since it also does some
DC balancing etc; at least this is what I gathered from a quick
For instance it is easy to construct a block code
with length 10 (bits) and 2^9 codewords that has
runlengths of at most 5 bits.
This means that the throughput is only 9/10 of that
without line coding, which is better than the 8/10 of the 8b/10b code.
If you consider a code of length 14 (bits) you can do the same
(i.e., have runlengths of at most 5 bits) with 2^13
codewords and a throughput of 13/14 > 9/10 > 8/10.
I am attaching a simple piece of code that shows you how
you can easily construct these codes (not optimized by any means).
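For readers without the attachment, here is a rough sketch of one way such a code can be built. It is not necessarily the attached program. The assumption (mine) is that every codeword may start with a run of at most 2 bits and end with a run of at most 3 bits, so that a run crossing the boundary between two codewords is still at most 5 bits long. The names build_code, max_lead and max_trail are made up for this illustration.

```python
from itertools import groupby, product

def run_lengths(word):
    """Lengths of the maximal runs of identical bits in a sequence of bits."""
    return [len(list(group)) for _, group in groupby(word)]

def build_code(n, max_run=5, max_lead=2, max_trail=3):
    """Brute-force all n-bit words that keep runs <= max_run when concatenated.

    Assumption: max_lead + max_trail <= max_run, so the run formed at the
    boundary between any two codewords cannot exceed max_run bits.
    """
    assert max_lead + max_trail <= max_run
    assert n > max_run  # rules out the all-zeros / all-ones words
    code = []
    for word in product((0, 1), repeat=n):
        runs = run_lengths(word)
        if (max(runs) <= max_run
                and runs[0] <= max_lead
                and runs[-1] <= max_trail):
            code.append(word)
    return code

for n in (10, 14):
    code = build_code(n)
    k = n - 1  # data bits we hope to carry per codeword
    enough = len(code) >= 2 ** k
    print(f"n={n}: {len(code)} valid words, need {2 ** k}, "
          f"enough={enough}, rate {k}/{n} = {k / n:.3f}")
```

Any 2^(n-1) of the surviving words can then serve as the codebook. Other splits of the 5-bit budget between the leading and trailing run are possible and may give a different number of words.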
The generated codes have no structure at all, so they should
be decoded using a lookup table. But I guess one can also
construct linear (or even cyclic) codes with these properties and make
the encoding/decoding simpler…
to get both the rate and the code.
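As an illustration of the lookup-table encoding/decoding mentioned above, here is a minimal sketch for the length-10 case. It rests on the same assumption as before (leading run at most 2 bits, trailing run at most 3 bits, which is my choice and not taken from the attachment), and the mapping from 9-bit values to codewords is simply the enumeration order, which is arbitrary.

```python
from itertools import groupby, product

N, K = 10, 9  # codeword length and data bits per codeword

def allowed(word, max_run=5, max_lead=2, max_trail=3):
    """True if the word keeps runs <= max_run under any concatenation."""
    runs = [len(list(group)) for _, group in groupby(word)]
    return (max(runs) <= max_run
            and runs[0] <= max_lead
            and runs[-1] <= max_trail)

# Keep the first 2^K valid words; which ones we keep is an arbitrary choice.
codewords = [w for w in product((0, 1), repeat=N) if allowed(w)][: 2 ** K]
assert len(codewords) == 2 ** K

# Decoder lookup table: 10-bit codeword -> 9-bit data value.
decode_table = {word: value for value, word in enumerate(codewords)}

def encode(values):
    """Map a sequence of 9-bit integers to a flat list of channel bits."""
    bits = []
    for value in values:
        bits.extend(codewords[value])
    return bits

def decode(bits):
    """Map a flat list of channel bits back to 9-bit integers."""
    return [decode_table[tuple(bits[i:i + N])]
            for i in range(0, len(bits), N)]

data = [0, 1, 255, 511]
assert decode(encode(data)) == data
```

A received block that is not in decode_table raises a KeyError here; a real decoder would have to decide how to handle such invalid words.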