Point-to-Point Wireless Communication (III):
Coding Schemes, Adaptive Modulation/Coding,
Hybrid ARQ/FEC
References
David MacKay, Information Theory, Inference & Learning Algorithms.
PDF available online:
[Link]
Chapter 1: Introduction to Information Theory
Skim Chapter 9: Communication over a noisy channel
Section 11.4: Capabilities of Practical Error Correcting Codes
Skim Chap 13: Binary codes (ideas of distance, perfect/MDS codes,
concatenation)
Skim Chapter 47: LDPC codes
Chapter 48: Convolutional & Turbo Codes
Optional browsing: Chap 49: Digital Fountain (erasure) codes
Article by Berlekamp: Application of Error Control Coding to
Communications (especially the discussions on RS coding, concatenated
codes, hybrid ARQ/FEC strategies)
Context: Time Diversity
Time diversity can be obtained by interleaving and coding over symbols across different coherence-time periods.
Channel: time diversity/selectivity, but correlated across successive symbols.
(Repetition) coding w/o interleaving: a full codeword is lost during a fade.
Interleaving of sufficient depth (> coherence time): at most 1 symbol of each codeword is lost.
Coding alone is not sufficient!
What is channel coding?
Transforming signals to improve communication performance by
increasing the robustness against channel impairments (noise,
interference, fading, ...)
It is a time-diversity technique, but can be broadly thought of as
techniques to make better use of the degrees-of-freedom in channels
(eg: space-time codes)
Waveform coding: Transforming waveforms to better waveforms
Structured sequences: Transforming data sequences into better
sequences, having structured redundancy.
Better in the sense of making the decision process less subject to
errors.
Introduce constraints on transmitted codewords to have greater
distance between them
Note: Channel coding was developed in the context of AWGN channels
& we shall study them in the same context
(Modified) Block Diagram
Transmit side (digital modulation): Format -> Source encode -> Channel encode -> Pulse modulate -> Bandpass modulate -> Channel
Receive side (digital demodulation): Channel -> Demod. & Sample -> Detect -> Channel decode -> Source decode -> Format
Channel Coding Schemes:
Block, Convolutional, Turbo
Coding Gain: The Value of Coding
Trade-offs: error performance vs. bandwidth, power vs. bandwidth, data rate vs. bandwidth, capacity vs. bandwidth.
Coding gain: for a given bit-error probability, the reduction in Eb/N0 that can be realized through the use of the code:
G [dB] = (Eb/N0)_u [dB] - (Eb/N0)_c [dB]
[Figure: PB vs. Eb/N0 (dB) for uncoded and coded systems; the horizontal gap between the curves at a given PB is the coding gain]
Coding Gain Potential
Gap-from-Shannon-limit @ BER = 10^-5:
9.6 + 1.59 = 11.2 dB
(about 7.8 dB if you maintain spectral efficiency)
The Ultimate Shannon Limit
Goal: what is the minimum Eb/N0 for any spectral efficiency rho (rho -> 0)?
Spectral efficiency rho = B/W = log2(1 + SNR), where SNR = Es/N0, Es = energy per symbol, B = bit rate, W = bandwidth
Or: SNR = (2^rho - 1)
Eb/N0 = Es/N0 * (W/B) = SNR/rho
Eb/N0 = (2^rho - 1)/rho > ln 2 = -1.59 dB   (the Shannon bound, approached as rho -> 0)
Let's try to appreciate what Shannon's bound means by designing some simple codes and comparing them to the Shannon bound.
Fix rho = 2 bits/s/Hz: Eb/N0 = (2^2 - 1)/2 = 3/2 = 1.76 dB
Gap-to-capacity @ BER = 10^-5:
9.6 dB + 1.59 = 11.2 dB (without regard for spectral efficiency)
or 9.6 - 1.76 = 7.84 dB (keeping spectral efficiency constant)
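The bound is easy to evaluate numerically. A minimal sketch of the minimum Eb/N0 as a function of spectral efficiency rho, using only the formula above:

```python
import math

def min_ebno_db(rho):
    """Minimum Eb/N0 (dB) for reliable communication at spectral efficiency rho (bits/s/Hz)."""
    return 10 * math.log10((2**rho - 1) / rho)

# As rho -> 0 the bound approaches ln 2 = -1.59 dB; at rho = 2 it is 1.76 dB.
for rho in (0.001, 1.0, 2.0, 4.0):
    print(f"rho = {rho}: Eb/N0 >= {min_ebno_db(rho):.2f} dB")
```

This makes the cost of bandwidth efficiency visible: the required Eb/N0 grows quickly with rho.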
Binary Symmetric Channel (BSC)
Given a BER f, we can model the channel as a BSC with crossover probability f.
Reliable Disk Drive Application
We want to build a disk drive and write a GB/day for 10 years.
=> desired BER: 10-15
Physical solution: use more reliable components, reduce noise
System solution: accept noisy channel, detect/correct errors
(engineer reliability over unreliable channels)
Repetition Code (R3) & Majority-Vote Decoding
[Figure: R3 encoding and majority-vote decoding over the AWGN/BSC channel]
Performance of R3
The error probability is dominated by the probability that two bits in
a block of three are flipped, which scales as f 2.
For BSC with f = 0.1, the R3 code has a probability of error, after
decoding, of pb = 0.03 per bit or 3%.
Rate penalty: we need 3 noisy disks to get the error probability down to 3%. To
reach a BER of 10^-15, we would need 61 disks!
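The R3 figures can be checked by simulation. A minimal sketch over a BSC (theory: pb = 3 f^2 (1-f) + f^3, about 0.028 for f = 0.1):

```python
import random

def r3_decode(bits):
    """Majority-vote decode a block of three received bits."""
    return 1 if sum(bits) >= 2 else 0

def simulate_r3(f, n_bits=100_000, seed=1):
    """Empirical post-decoding BER of R3 over a BSC with crossover probability f."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_bits):
        sent = rng.randint(0, 1)
        # Each of the three copies is flipped independently with probability f.
        received = [sent ^ (rng.random() < f) for _ in range(3)]
        errors += r3_decode(received) != sent
    return errors / n_bits

print(simulate_r3(0.1))  # close to 0.028
```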
Coding: Rate-BER Tradeoff?
Repetition
code R3:
Let's try to design a better code: the Hamming Code
Shannon: The perception that there is a necessary tradeoff between Rate and BER is
illusory! There is no such tradeoff up to a critical rate, the channel capacity!
You only need to design better codes to realize the coding gain
Hamming Code: Linear Block Code
A block code is a rule for converting a sequence of source bits s, of length
K, say, into a transmitted sequence t of length N bits.
In a linear block code, the extra N-K bits are linear functions of the original
K bits; these extra bits are called parity-check bits.
(7, 4) Hamming code: transmits N = 7 bits for every K = 4 source bits.
The first four transmitted bits, t1t2t3t4, are set equal to the four source
bits, s1s2s3s4.
The parity-check bits t5t6t7 are set so that the parity within each circle
(see below) is even
Hamming Code: (Contd)
Hamming Code: Syndrome Decoding
If channel is BSC and all source vectors are equiprobable, then
the optimal decoder identifies the source vector s whose encoding
t(s) differs from the received vector r in the fewest bits.
Similar to closest-distance decision rule seen in demodulation!
Can we do it more efficiently? Yes: Syndrome decoding
Tx
The decoding task is to find the smallest set of flipped bits that can account for
these violations of the parity rules.
[The pattern of violations of the parity checks is called the syndrome: the
syndrome above is z = (1, 1, 0), because the first two circles are `unhappy'
(parity 1) and the third circle is `happy' (parity 0).]
Syndrome Decoding (Contd)
Can we find a unique bit that lies inside all the
`unhappy' circles and outside all the `happy' circles?
If so, the flipping of that bit would account for the
observed syndrome.
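The circle-picture decoding can be sketched in code. A minimal sketch, assuming MacKay's parity convention (t5 = s1+s2+s3, t6 = s2+s3+s4, t7 = s1+s3+s4; other texts order the checks differently); the table maps each single-bit error's syndrome to the bit to flip:

```python
from itertools import product

# (7,4) Hamming code, circle-parity convention (an assumption, per MacKay Ch. 1):
# t5 = s1^s2^s3, t6 = s2^s3^s4, t7 = s1^s3^s4
def hamming74_encode(s):
    s1, s2, s3, s4 = s
    return [s1, s2, s3, s4, s1 ^ s2 ^ s3, s2 ^ s3 ^ s4, s1 ^ s3 ^ s4]

def syndrome(r):
    # Re-evaluate each parity check on the received word.
    return (r[4] ^ r[0] ^ r[1] ^ r[2],
            r[5] ^ r[1] ^ r[2] ^ r[3],
            r[6] ^ r[0] ^ r[2] ^ r[3])

# Each single-bit error produces a distinct syndrome.
SYNDROME_TABLE = {}
for i in range(7):
    e = [0] * 7
    e[i] = 1
    SYNDROME_TABLE[syndrome(e)] = i

def hamming74_decode(r):
    r = list(r)
    z = syndrome(r)
    if z != (0, 0, 0):              # non-zero syndrome: flip the indicated bit
        r[SYNDROME_TABLE[z]] ^= 1
    return r[:4]

# Exhaustive check: every single-bit error in every codeword is corrected.
for s in product([0, 1], repeat=4):
    t = hamming74_encode(list(s))
    for i in range(7):
        r = t[:]
        r[i] ^= 1
        assert hamming74_decode(r) == list(s)
print("all single-bit errors corrected")
```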
Hamming Code: Performance
A decoding error will occur whenever the noise has flipped more than one
bit in a block of seven.
The probability scales as O(f 2), as did the probability of error for the
repetition code R3; but Hamming code has a greater rate, R = 4/7.
Dilbert Test: About 7% of the decoded bits are in error. The residual errors
are correlated: often two or three successive decoded bits are flipped
Generalizations of Hamming codes: called BCH codes
Shannon's Legacy: Rate-Reliability of Codes
Noisy-channel coding theorem: defines achievable rate/reliability regions.
Note: you can get BER as low as desired by designing an appropriate code within the capacity region.
Shannon Legacy (Contd)
The maximum rate at which communication is possible with
arbitrarily small pb is called the capacity of the channel.
BSC(f) capacity: C = 1 - H2(f), where H2 is the binary entropy function.
f = 0.1 has capacity C = 0.53.
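The BSC capacity C = 1 - H2(f) is a one-liner; a quick check of the f = 0.1 figure:

```python
import math

def bsc_capacity(f):
    """Capacity of a binary symmetric channel: C = 1 - H2(f)."""
    if f in (0.0, 1.0):
        return 1.0  # noiseless (or deterministically inverted) channel
    h2 = -f * math.log2(f) - (1 - f) * math.log2(1 - f)
    return 1 - h2

print(bsc_capacity(0.1))  # about 0.531
```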
Caveats & Remarks
Strictly, the above statements might not be quite right:
Shannon proved his noisy-channel coding theorem by studying
sequences of block codes with ever-increasing block lengths,
and the required block length might be bigger than a gigabyte
(the size of our disk drive),
in which case, Shannon might say `well, you can't do it with
those tiny disk drives, but if you had two noisy terabyte drives,
you could make a single high-quality terabyte drive from them'.
Information theory addresses both the limitations and the
possibilities of communication.
Reliable communication at any rate beyond the capacity is
impossible; reliable communication at all rates up
to capacity is possible.
Generalize: Linear Coding/Syndrome Decoding
The first four received bits, r1r2r3r4, purport to be the four source bits; and
the received bits r5r6r7 purport to be the parities of the source bits, as defined
by the generator matrix G.
Evaluate the three parity-check bits for the received bits, r1r2r3r4, and see
whether they match the three received bits, r5r6r7.
The differences (modulo 2) between these two triplets are called the
syndrome of the received vector.
If the syndrome is zero then the received vector is a codeword, and the
most probable decoding is given by reading out its first four bits.
If the syndrome is non-zero, then the noise sequence for this block was
non-zero, and the syndrome is our pointer to the most probable error
pattern.
Linear Coding/Syndrome Decoding (Contd)
Coding: U = mG. Received vector: r = U + e; syndrome: S = rH^T.
The syndrome-decoding problem is to find the most probable error pattern e satisfying eH^T = S.
H is the parity check matrix.
Let's now build linear codes from the ground up (first principles).
Some definitions
Binary field:
The set {0,1}, under modulo-2 addition and multiplication, forms a field.
Addition:        Multiplication:
0 + 0 = 0        0 . 0 = 0
0 + 1 = 1        0 . 1 = 0
1 + 0 = 1        1 . 0 = 0
1 + 1 = 0        1 . 1 = 1
The binary field is also called the Galois field, GF(2).
Definitions: Fields
Fields:
Let F be a set of objects on which two operations + and . are defined.
F is said to be a field if and only if
1. F forms a commutative group under the + operation.
   The additive identity element is labeled 0.
   a, b in F => a + b = b + a in F
2. F - {0} forms a commutative group under the . operation.
   The multiplicative identity element is labeled 1.
   a, b in F => a . b = b . a in F
3. The operations + and . distribute:
   a . (b + c) = (a . b) + (a . c)
Definitions: Vector Space over Fields
Vector space: (note: it mixes vectors and scalars)
Let V be a set of vectors and F a field of elements called scalars. V forms a vector space over F if:
1. Commutative: u, v in V => u + v = v + u in V
2. Closure: a in F, v in V => a . v = u in V
3. Distributive: (a + b) . v = a . v + b . v and a . (u + v) = a . u + a . v
4. Associative: a, b in F, v in V => (a . b) . v = a . (b . v)
5. Identity element: v in V, 1 . v = v
Vector Spaces, Subspaces
Examples of vector spaces:
The set of binary n-tuples, denoted by Vn. For example:
V4 = {(0000), (0001), (0010), (0011), (0100), (0101), (0110), (0111),
      (1000), (1001), (1010), (1011), (1100), (1101), (1110), (1111)}
Vector subspace:
A subset S of the vector space Vn is called a subspace if:
Zero: The all-zero vector is in S.
Closure: The sum of any two vectors in S is also in S.
Example:
{(0000), (0101), (1010), (1111)} is a subspace of V4.
Span, Bases
Spanning set:
A collection of vectors G = {v1, v2, ..., vn}, the linear combinations of which include all vectors in a vector space V, is said to be a spanning set for V, or to span V.
Example:
{(1000), (0110), (1100), (0011), (1001)} spans V4.
Bases:
A spanning set for V that has minimal cardinality is called a basis for V.
(The cardinality of a set is the number of objects in the set.)
Example:
{(1000), (0100), (0010), (0001)} is a basis for V4.
Linear Block Codes are just Subspaces!
Linear block code (n,k):
A set C in Vn with cardinality 2^k is called a linear block code if, and only if, it is a subspace of the vector space Vn:
Vk -> C in Vn
Members of C are called codewords.
The all-zero codeword is a codeword.
Any linear combination of codewords is a codeword.
Linear block codes contd
[Figure: mapping from Vk into Vn; the image subspace C is spanned by the bases of C]
Linear block codes contd
The information bit stream is chopped into blocks of k bits.
Each block is encoded to a larger block of n bits.
The coded bits are modulated and sent over the channel.
The reverse procedure is done at the receiver.
[Diagram: data block of k bits -> channel encoder -> codeword of n bits = k message bits + (n-k) redundant bits]
Code rate: Rc = k/n
Recall: Reed-Solomon RS(N,K): Linear Algebra in Action
RS(N,K): K data packets plus N-K FEC packets form a block of size N; if >= K of the N packets are received over a lossy network, the K data packets can be recovered.
This is linear algebra in action: design a K-dimensional vector subspace out of an N-dimensional vector space.
Linear block codes contd
The Hamming weight of a vector U, denoted by w(U), is the number of non-zero elements in U.
The Hamming distance between two vectors U and V, denoted d(U, V), is the number of elements in which they differ:
d(U, V) = w(U + V)
The minimum distance of a block code is
dmin = min_{i != j} d(Ui, Uj) = min_i w(Ui)
Linear block codes contd
Error detection capability is given by
e = dmin - 1
Error correction capability t of a code, which is defined as the maximum number of guaranteed correctable errors per codeword, is
t = floor((dmin - 1)/2)
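For small codes, dmin can be found by brute force as the minimum weight over all non-zero codewords. A sketch using the (6,3) generator matrix that appears as the example code in these notes:

```python
from itertools import product

def min_distance(G):
    """Minimum distance of a linear code = minimum weight of a non-zero codeword."""
    k = len(G)
    n = len(G[0])
    best = n
    for m in product([0, 1], repeat=k):
        if not any(m):
            continue                        # skip the all-zero message
        cw = [0] * n
        for mi, row in zip(m, G):
            if mi:
                cw = [c ^ r for c, r in zip(cw, row)]
        best = min(best, sum(cw))
    return best

# The (6,3) example code used later in these notes.
G = [[1, 1, 0, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 0, 1, 0, 0, 1]]
d = min_distance(G)
print(d, "-> detects", d - 1, "errors, corrects", (d - 1) // 2)
```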
Linear block codes contd
[Figure: mapping from Vk to a subspace of Vn via the bases of C]
The generator matrix G is constructed by taking as its rows the basis vectors {V1, V2, ..., Vk}:
    [ V1 ]   [ v11 v12 ... v1n ]
G = [ V2 ] = [ v21 v22 ... v2n ]
    [ ...]   [ ...             ]
    [ Vk ]   [ vk1 vk2 ... vkn ]
Linear block codes contd
Encoding in an (n,k) block code: U = mG
(u1, u2, ..., un) = (m1, m2, ..., mk) [V1; V2; ...; Vk]
(u1, u2, ..., un) = m1 V1 + m2 V2 + ... + mk Vk
The rows of G are linearly independent.
Linear block codes contd
Example: block code (6,3)
    [ V1 ]   [ 1 1 0 1 0 0 ]
G = [ V2 ] = [ 0 1 1 0 1 0 ]
    [ V3 ]   [ 1 0 1 0 0 1 ]
Message vector -> Codeword:
000 -> 000000
100 -> 110100
010 -> 011010
110 -> 101110
001 -> 101001
101 -> 011101
011 -> 110011
111 -> 000111
Systematic Block Codes
Systematic block code (n,k):
For a systematic code, the first (or last) k elements in the codeword are information bits.
G = [P | Ik]
Ik = k x k identity matrix
P = k x (n-k) matrix
U = (u1, u2, ..., un) = (p1, p2, ..., p_{n-k}, m1, m2, ..., mk)
    parity bits p1..p_{n-k}, message bits m1..mk
Linear block codes contd
For any linear code we can find an (n-k) x n matrix H whose rows are orthogonal to the rows of G:
GH^T = 0
Why? H checks the parity of the received word (i.e. maps the n-bit word to an (n-k)-bit syndrome).
Codewords (= mG) have a syndrome of 0 (i.e. they form the null space of H).
H is called the parity check matrix and its rows are linearly independent.
For systematic linear block codes:
H = [I_{n-k} | P^T]
Linear block codes contd
[Block diagram: data source -> format -> channel encoding -> modulation -> channel -> demodulation/detection -> channel decoding -> format -> data sink]
r = U + e
r = (r1, r2, ..., rn)   received codeword (vector)
e = (e1, e2, ..., en)   error pattern (vector)
Syndrome testing:
S is the syndrome of r, corresponding to the error pattern e:
S = rH^T = eH^T
Linear block codes contd
Error pattern -> Syndrome (for the (6,3) example code):
000000 -> 000
000001 -> 101
000010 -> 011
000100 -> 110
001000 -> 001
010000 -> 010
100000 -> 100
010001 -> 111
Example:
U = (101110) transmitted, r = (001110) received.
The syndrome of r is computed: S = rH^T = (001110)H^T = (100)
The error pattern corresponding to this syndrome is e^ = (100000)
The corrected vector is estimated: U^ = r + e^ = (001110) + (100000) = (101110) = U
There is a unique mapping from syndrome to error pattern.
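Syndrome-table decoding for the (6,3) example can be sketched directly from H = [I3 | P^T]. A minimal sketch; the table here is built only from the single-bit coset leaders (the eighth syndrome, 111, belongs to the double-error leader 010001 and is omitted):

```python
# Parity check matrix of the systematic (6,3) example: H = [I3 | P^T]
H = [[1, 0, 0, 1, 0, 1],
     [0, 1, 0, 1, 1, 0],
     [0, 0, 1, 0, 1, 1]]

def syndrome(r):
    """S = r H^T over GF(2)."""
    return tuple(sum(h * x for h, x in zip(row, r)) % 2 for row in H)

# Syndrome -> coset-leader table from the single-bit error patterns.
table = {(0, 0, 0): [0] * 6}
for i in range(6):
    e = [0] * 6
    e[i] = 1
    table[syndrome(e)] = e

r = [0, 0, 1, 1, 1, 0]                       # received word from the example
e_hat = table[syndrome(r)]                   # estimated error pattern (100000)
corrected = [ri ^ ei for ri, ei in zip(r, e_hat)]
print(syndrome(r), corrected)                # (1, 0, 0) [1, 0, 1, 1, 1, 0]
```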
Standard Array: Error Patterns
Example: standard array for the (6,3) code (first row: codewords; each later row: coset = error pattern + codeword):
000000 110100 011010 101110 101001 011101 110011 000111
000001 110101 011011 101111 101000 011100 110010 000110
000010 110110 011000 101100 101011 011111 110001 000101
000100 110000 011110 101010 101101 011001 110111 000011
001000 111100 010010 100110 100001 010101 111011 001111
010000 100100 001010 111110 111001 001101 100011 010111
100000 010100 111010 001110 001001 111101 010011 100111
010001 100101 001011 111111 111000 001100 100010 010110
The first column contains the coset leaders (error patterns).
Linear block codes contd
Standard array construction:
1. The first row consists of the zero vector followed by all the codewords (the zero codeword U1 first).
2. For row i = 2, 3, ..., 2^{n-k}, find a vector in Vn of minimum weight which is not already listed in the array.
Call this error pattern e_i and form the i-th row as the corresponding coset e_i + C:
[ U1 = 0        U2              ...  U_{2^k}            ]
[ e2            e2 + U2         ...  e2 + U_{2^k}       ]
[ ...                                                    ]
[ e_{2^{n-k}}   e_{2^{n-k}}+U2  ...  e_{2^{n-k}}+U_{2^k}]
The first column contains the coset leaders.
Linear block codes contd
Standard array and syndrome table decoding:
1. Calculate the syndrome S = rH^T.
2. Find the coset leader e^ = e_i corresponding to S.
3. Calculate U^ = r + e^ and the corresponding m^.
Note that U^ = r + e^ = (U + e) + e^ = U + (e + e^)
If e^ = e, the error is corrected.
If e^ != e, an undetectable decoding error occurs.
Hamming codes
Hamming codes are a subclass of linear block codes and belong to the category of perfect codes.
Hamming codes are expressed as a function of a single integer m >= 2, i.e. n and k are derived from m:
Code length:                 n = 2^m - 1
Number of information bits:  k = 2^m - m - 1
Number of parity bits:       n - k = m
Error correction capability: t = 1
The columns of the parity-check matrix H consist of all non-zero binary m-tuples.
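The parameter relations can be tabulated for small m; a quick sketch:

```python
def hamming_params(m):
    """(n, k, n-k) for the Hamming code defined by m >= 2 parity bits."""
    n = 2**m - 1
    k = 2**m - m - 1
    return n, k, n - k

for m in range(2, 6):
    print(m, hamming_params(m))  # (3,1,2), (7,4,3), (15,11,4), (31,26,5)
```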
Hamming codes
Example: systematic Hamming code (7,4)
                     [ 1 0 0 0 1 1 1 ]
H = [I_{3x3} | P^T] = [ 0 1 0 1 0 1 1 ]
                     [ 0 0 1 1 1 0 1 ]
                     [ 0 1 1 1 0 0 0 ]
G = [P | I_{4x4}]  = [ 1 0 1 0 1 0 0 ]
                     [ 1 1 0 0 0 1 0 ]
                     [ 1 1 1 0 0 0 1 ]
Cyclic block codes
Cyclic codes are a subclass of linear block codes.
Encoding and syndrome calculation are easily performed using feedback shift registers.
Hence, relatively long block codes can be implemented with reasonable complexity.
BCH and Reed-Solomon codes are cyclic codes.
Cyclic block codes
A linear (n,k) code is called a cyclic code if all cyclic shifts of a codeword are also codewords.
U = (u0, u1, u2, ..., u_{n-1})
The i-th cyclic shift of U:
U^(i) = (u_{n-i}, u_{n-i+1}, ..., u_{n-1}, u0, u1, ..., u_{n-i-1})
Example:
U = (1101)
U^(1) = (1110), U^(2) = (0111), U^(3) = (1011), U^(4) = (1101) = U
Cyclic block codes
The algebraic structure of cyclic codes suggests expressing codewords in polynomial form:
U(X) = u0 + u1 X + u2 X^2 + ... + u_{n-1} X^{n-1}   (degree n-1)
Relationship between a codeword and its cyclic shifts:
X U(X) = u0 X + u1 X^2 + ... + u_{n-2} X^{n-1} + u_{n-1} X^n
       = [u_{n-1} + u0 X + u1 X^2 + ... + u_{n-2} X^{n-1}] + u_{n-1} X^n + u_{n-1}
       = U^(1)(X) + u_{n-1} (X^n + 1)
Hence:
U^(1)(X) = X U(X) modulo (X^n + 1)
By extension:
U^(i)(X) = X^i U(X) modulo (X^n + 1)
Cyclic block codes
Basic properties of cyclic codes:
Let C be a binary (n,k) linear cyclic code.
1. Within the set of code polynomials in C, there is a unique monic polynomial g(X) with minimal degree r < n. g(X) is called the generator polynomial:
g(X) = g0 + g1 X + ... + gr X^r
2. Every code polynomial U(X) in C can be expressed uniquely as U(X) = m(X)g(X).
3. The generator polynomial g(X) is a factor of X^n + 1.
Cyclic block codes
4. The orthogonality of G and H in polynomial form is expressed as g(X)h(X) = X^n + 1.
This means h(X) is also a factor of X^n + 1.
5. Row i, i = 1, ..., k, of the generator matrix is formed by the coefficients of the (i-1)-th cyclic shift of the generator polynomial:
    [ g0 g1 ... gr 0  ...  0 ]   <- g(X)
G = [ 0  g0 g1 ... gr ...  0 ]   <- X g(X)
    [ ...                    ]
    [ 0  ... 0  g0 g1 ... gr ]   <- X^{k-1} g(X)
G is a Toeplitz matrix (like the circulant matrix): efficient linear-algebra operations (multiplication, inverse, solution of Ax = b, etc.) are possible.
Cyclic block codes
Systematic encoding algorithm for an (n,k) cyclic code:
1. Multiply the message polynomial m(X) by X^{n-k}.
2. Divide the result of step 1 by the generator polynomial g(X). Let p(X) be the remainder.
3. Add p(X) to X^{n-k} m(X) to form the codeword U(X).
Remember the CRC used to detect errors in packets?
Cyclic Redundancy Check: same idea!
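The three steps map directly to GF(2) polynomial division, exactly as in a CRC. A minimal sketch with polynomials as bit lists in ascending-power order (LSB first):

```python
def poly_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division; polynomials are bit lists, LSB first."""
    r = list(dividend)
    for i in range(len(r) - 1, len(divisor) - 2, -1):
        if r[i]:  # cancel the X^i term with divisor * X^(i - deg g)
            for j, d in enumerate(divisor):
                r[i - len(divisor) + 1 + j] ^= d
    return r[:len(divisor) - 1]

def cyclic_encode(m, g):
    """Systematic cyclic encoding: U(X) = p(X) + X^(n-k) m(X), p = X^(n-k) m(X) mod g(X)."""
    nk = len(g) - 1                      # n - k parity bits = deg g(X)
    p = poly_mod([0] * nk + list(m), g)  # remainder of X^(n-k) m(X) / g(X)
    return p + list(m)                   # parity bits first, then message bits

# (7,4) code from the notes: g(X) = 1 + X + X^3, message m = (1011)
print(cyclic_encode([1, 0, 1, 1], [1, 1, 0, 1]))  # [1, 0, 0, 1, 0, 1, 1]
```

This reproduces the worked (7,4) example that follows: p(X) = 1, U = (1001011).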
Cyclic block codes
Example: for the systematic (7,4) cyclic code with generator polynomial g(X) = 1 + X + X^3:
1. Find the codeword for the message m = (1011).
n = 7, k = 4, n - k = 3
m = (1011) -> m(X) = 1 + X^2 + X^3
X^{n-k} m(X) = X^3 m(X) = X^3 (1 + X^2 + X^3) = X^3 + X^5 + X^6
Divide X^{n-k} m(X) by g(X):
X^3 + X^5 + X^6 = (1 + X + X^2 + X^3)(1 + X + X^3) + 1
                  quotient q(X)       generator g(X)   remainder p(X)
Form the codeword polynomial:
U(X) = p(X) + X^3 m(X) = 1 + X^3 + X^5 + X^6
U = (100 1011)   parity bits | message bits
Example: Encoding of systematic cyclic codes
[Figure: shift-register encoder for systematic cyclic codes]
Decoding cyclic codes: the syndrome is the remainder s(X) = r(X) mod g(X), computed with the same g(X) shift-register circuit (Table 16.6).
Cyclic block codes
2. Find the generator and parity check matrices, G and H, respectively.
g(X) = 1 + 1.X + 0.X^2 + 1.X^3 -> (g0, g1, g2, g3) = (1101)
    [ 1 1 0 1 0 0 0 ]
G = [ 0 1 1 0 1 0 0 ]
    [ 0 0 1 1 0 1 0 ]
    [ 0 0 0 1 1 0 1 ]
Not in systematic form. We do the following row operations:
row(1) + row(3) -> row(3)
row(1) + row(2) + row(4) -> row(4)
    [ 1 1 0 1 0 0 0 ]
G = [ 0 1 1 0 1 0 0 ]  = [P | I_{4x4}]
    [ 1 1 1 0 0 1 0 ]
    [ 1 0 1 0 0 0 1 ]
    [ 1 0 0 1 0 1 1 ]
H = [ 0 1 0 1 1 1 0 ]  = [I_{3x3} | P^T]
    [ 0 0 1 0 1 1 1 ]
Cyclic block codes
Syndrome decoding for cyclic codes:
The received codeword in polynomial form is given by
r(X) = U(X) + e(X)   (received polynomial = transmitted codeword + error pattern)
The syndrome is the remainder obtained by dividing the received polynomial by the generator polynomial:
r(X) = q(X)g(X) + S(X)
With the syndrome and the standard array, the error is estimated.
In cyclic codes, the size of the standard array is considerably reduced.
Example of the block codes
[Figure: bit-error probability PB vs. Eb/N0 (dB) for coded 8PSK vs. uncoded QPSK]
Well-known Cyclic Codes
(n,1) Repetition codes. High coding gain, but low rate.
(n,k) Hamming codes. Minimum distance always 3; thus can detect 2 errors and correct one error. n = 2^m - 1, k = n - m, m >= 3.
Maximum-length codes. For every integer k >= 3 there exists a maximum-length code (n,k) with n = 2^k - 1, dmin = 2^{k-1}. Hamming codes are duals of maximum-length codes.
BCH codes. For every integer m >= 3 there exists a code with n = 2^m - 1, k >= n - mt and dmin >= 2t + 1, where t is the error correction capability.
(n,k) Reed-Solomon (RS) codes. Work with k symbols of m bits each, encoded to yield codewords of n symbols. For these codes n = 2^m - 1, number of check symbols n - k = 2t, and dmin = 2t + 1.
BCH and RS are popular due to large dmin, a large number of codes, and easy generation.
Reed-Solomon Codes (RS)
Group bits into L-bit symbols. Like BCH codes, but over symbols rather than single bits.
Can tolerate burst errors better (fewer symbols in error for a given bit-level burst event).
Shortened RS codes are used in CD-ROMs, DVDs, etc.
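The burst-tolerance point can be made concrete: a contiguous bit burst of length b touches at most ceil(b/L) + 1 L-bit symbols, wherever it starts. A small sketch:

```python
def symbols_in_error(burst_start, burst_len, symbol_bits=8):
    """How many L-bit symbols a contiguous bit burst touches."""
    first = burst_start // symbol_bits
    last = (burst_start + burst_len - 1) // symbol_bits
    return last - first + 1

# A 16-bit burst hits at most 3 eight-bit symbols, wherever it starts,
# so a symbol-level code only has to correct 3 symbol errors.
worst = max(symbols_in_error(s, 16) for s in range(64))
print(worst)  # 3
```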
Shortened Reed Solomon Codes
A shortened RS(N,K) code fixes z of the K information symbols to zero, so K = d + z where d is the actual data; the block of size N then consists of the d data symbols, the z zeros (not transmitted), and the F = N - K FEC symbols.
RS-code performance
Longer blocks, better performance.
Encoding/decoding complexity is lower for higher code rates: O{K(N-K) log2 N}.
5.7-5.8 dB coding gain @ BER = 10^-5 (similar to 5.1 dB for convolutional codes, see later).
Convolutional Codes
Block vs. convolutional coding:
(n,k) block codes: the encoder output of n bits depends only on the k input bits.
(n,k,K) convolutional codes: each source bit influences n(K+1) encoder output bits.
n(K+1) is the constraint length; K is the memory depth.
Block diagram: Convolutional Coding
Information source -> rate-1/n convolutional encoder -> modulator -> channel -> demodulator -> rate-1/n convolutional decoder -> information sink.
m = (m1, m2, ..., mi, ...)   input sequence
U = G(m) = (U1, U2, U3, ..., Ui, ...)   codeword sequence
Ui = (u1i, ..., uji, ..., uni)   branch word (n coded bits)
Z = (Z1, Z2, Z3, ..., Zi, ...)   received sequence
Zi = (z1i, ..., zji, ..., zni)   demodulator outputs for branch word i (n outputs per branch word)
m^ = (m^1, m^2, ..., m^i, ...)   decoded sequence
Convolutional codes-contd
A convolutional code is specified by three parameters (n, k, K) or (k/n, K), where:
Rc = k/n is the coding rate, determining the number of data bits per coded bit.
In practice, usually k = 1 is chosen, and we assume that from now on.
K is the constraint length of the encoder, where the encoder has K-1 memory elements.
A Rate-1/2 Convolutional encoder
Convolutional encoder (rate 1/2, K = 3):
A 3-bit shift register, where the first stage takes the incoming data bit and the rest form the memory of the encoder.
Two modulo-2 adders produce the branch word for each input data bit:
u1 = first coded bit, u2 = second coded bit; output coded bits (u1, u2).
A Rate-1/2 Convolutional encoder
Message sequence: m = (101)
At each time step the register contents determine the branch word (u1 u2):
t1: 1 0 0 -> 11
t2: 0 1 0 -> 10
t3: 1 0 1 -> 00
A Rate-1/2 Convolutional encoder (contd)
t4: 0 1 0 -> 10
t5: 0 0 1 -> 11
t6: 0 0 0 -> 00 (register flushed)
Encoder: m = (101) -> U = (11 10 00 10 11)
n = 2, k = 1, K = 3, L = 3 input bits -> 10 output bits
Effective code rate
Initialize the memory before encoding the first bit (all-zero).
Clear out the memory after encoding the last bit (all-zero).
Hence, a tail of K-1 zero-bits is appended to the data bits: data + tail -> codeword.
Effective code rate (L data bits, k = 1 assumed):
Reff = L / (n(L + K - 1)) < Rc = 1/n
Example: m = (101) -> U = (11 10 00 10 11)
n = 2, k = 1, K = 3, L = 3 input bits.
Output length = n(L + K - 1) = 2 (3 + 3 - 1) = 10 output bits, so Reff = 3/10.
Encoder representation
Vector representation:
We define n binary vectors with K elements each (one vector for each modulo-2 adder).
The i-th element in each vector is 1 if the i-th stage of the shift register is connected to the corresponding modulo-2 adder, and 0 otherwise.
Example:
g1 = (111)
g2 = (101)
Encoder representation: Impulse Response
Impulse response representation:
The response of the encoder to a single 1 bit that goes through it.
Example:
Register contents -> branch word (u1 u2):
100 -> 11
010 -> 10
001 -> 11
Input sequence 1 0 0 -> output sequence 11 10 11
For m = (101), superpose shifted impulse responses:
Input 1: 11 10 11
Input 0:    00 00 00
Input 1:       11 10 11
Modulo-2 sum: 11 10 00 10 11
Encoder representation: Polynomial
Polynomial representation:
We define n generator polynomials, one for each modulo-2 adder. Each polynomial is of degree K-1 or less and describes the connections of the shift register to the corresponding modulo-2 adder.
Example:
g1(X) = g0(1) + g1(1) X + g2(1) X^2 = 1 + X + X^2
g2(X) = g0(2) + g1(2) X + g2(2) X^2 = 1 + X^2
The output sequence is found as follows:
U(X) = m(X)g1(X) interlaced with m(X)g2(X)
Encoder representation contd
In more detail, for m = (101), i.e. m(X) = 1 + X^2:
m(X)g1(X) = (1 + X^2)(1 + X + X^2) = 1 + X + X^3 + X^4
m(X)g2(X) = (1 + X^2)(1 + X^2) = 1 + X^4
m(X)g1(X) = 1 + 1.X + 0.X^2 + 1.X^3 + 1.X^4
m(X)g2(X) = 1 + 0.X + 0.X^2 + 0.X^3 + 1.X^4
U(X) = (1,1) + (1,0)X + (0,0)X^2 + (1,0)X^3 + (1,1)X^4
U = 11 10 00 10 11
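The generator-vector description translates directly into a shift-register encoder. A minimal sketch for the example rate-1/2, K = 3 code (g1 = 111, g2 = 101), with the K-1 zero tail appended:

```python
def conv_encode(bits, g1=(1, 1, 1), g2=(1, 0, 1)):
    """Rate-1/2, K=3 convolutional encoder with zero-tail termination."""
    state = [0, 0]                       # the two memory elements
    out = []
    for b in list(bits) + [0, 0]:        # append K-1 = 2 tail zeros
        reg = [b] + state                # full 3-bit register contents
        out.append(sum(r * g for r, g in zip(reg, g1)) % 2)  # u1
        out.append(sum(r * g for r, g in zip(reg, g2)) % 2)  # u2
        state = reg[:2]                  # shift: drop the oldest bit
    return out

print(conv_encode([1, 0, 1]))  # [1, 1, 1, 0, 0, 0, 1, 0, 1, 1] = 11 10 00 10 11
```

This reproduces the branch words derived above for m = (101).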
State diagram
A finite-state machine only encounters a finite number of states.
State of a machine: the smallest amount of information that, together with a current input to the machine, can predict the output of the machine.
In a convolutional encoder, the state is represented by the content of the memory.
Hence, there are 2^{K-1} states (grows exponentially with constraint length).
State diagram contd
[State diagram: states S0 = 00, S1 = 01, S2 = 10, S3 = 11; edges labeled input/output]
Current state | Input | Next state | Output
S0 (00)       |   0   | S0         | 00
S0 (00)       |   1   | S2         | 11
S1 (01)       |   0   | S0         | 11
S1 (01)       |   1   | S2         | 00
S2 (10)       |   0   | S1         | 10
S2 (10)       |   1   | S3         | 01
S3 (11)       |   0   | S1         | 01
S3 (11)       |   1   | S3         | 10
Trellis contd
The trellis diagram is an extension of the state diagram that shows the passage of time.
Example of a section of trellis for the rate-1/2 code:
[Trellis section from time ti to ti+1, states S0 = 00, S2 = 10, S1 = 01, S3 = 11; branches labeled input/output: 0/00, 1/11, 0/11, 1/00, 0/10, 1/01, 0/01, 1/10]
Trellis contd
A trellis diagram for the example code:
[Trellis over t1..t6 for input bits 1 0 1 plus tail bits 0 0; output bits 11 10 00 10 11; branches labeled input/output]
Trellis contd
[Same trellis with the path for input bits 1 0 1 plus tail bits 0 0 highlighted; the path through the trellis produces output bits 11 10 00 10 11]
Optimum decoding
If the input sequence messages are equally likely, the optimum decoder which minimizes the probability of error is the maximum likelihood (ML) decoder.
The ML decoder selects the codeword, among all possible codewords, which maximizes the likelihood function p(Z | U^(m)), where Z is the received sequence and U^(m) is one of the possible codewords: 2^L codewords to search!!!
ML decoding rule:
Choose U^(m') if p(Z | U^(m')) = max over all U^(m) of p(Z | U^(m))
ML decoding for memoryless channels
Due to the independent channel statistics for memoryless channels, the likelihood function becomes
p(Z | U^(m)) = prod_i p(Zi | Ui^(m)) = prod_i prod_{j=1..n} p(z_ji | u_ji^(m))
and equivalently, the log-likelihood function becomes
gamma_U^(m) = log p(Z | U^(m)) = sum_i log p(Zi | Ui^(m)) = sum_i sum_{j=1..n} log p(z_ji | u_ji^(m))
(path metric = sum of branch metrics = sum of bit metrics)
The path metric up to time index i is called the partial path metric.
ML decoding rule:
Choose the path with the maximum metric among all the paths in the trellis.
This path is the "closest" path to the transmitted sequence.
AWGN channels
For BPSK modulation, the transmitted sequence corresponding to the codeword U^(m) is denoted by S^(m) = (S1^(m), S2^(m), ..., Si^(m), ...), where Si^(m) = (s_1i^(m), ..., s_ji^(m), ..., s_ni^(m)) and s_ji = +-sqrt(Ec).
The log-likelihood function becomes
gamma_U^(m) = sum_i sum_j z_ji s_ji^(m) = <Z, S^(m)>   (inner product or correlation between Z and S^(m))
Maximizing the correlation is equivalent to minimizing the Euclidean distance.
ML decoding rule:
Choose the path with minimum Euclidean distance to the received sequence.
The Viterbi algorithm
The Viterbi algorithm performs maximum likelihood decoding.
It finds a path through the trellis with the largest metric (maximum correlation or minimum distance).
It processes the demodulator outputs in an iterative manner.
At each step in the trellis, it compares the metrics of all paths entering each state, and keeps only the path with the largest metric, called the survivor, together with its metric.
It proceeds in the trellis by eliminating the least likely paths.
It reduces the decoding complexity from 2^L to L 2^{K-1}!
The Viterbi algorithm - contd
A. Do the following setup:
For a data block of L bits, form the trellis. The trellis has L+K-1 sections or levels, starting at time t1 and ending at time t_{L+K}.
Label all the branches in the trellis with their corresponding branch metric.
For each state S(ti) in {0, 1, ..., 2^{K-1} - 1} in the trellis at time ti, define a parameter (path metric) Gamma(S(ti), ti).
B. Then, do the following:
1. Set Gamma(0, t1) = 0 and i = 2.
2. At time ti, compute the partial path metrics for all the paths entering each state.
3. Set Gamma(S(ti), ti) equal to the best partial path metric entering each state at time ti.
Keep the survivor path and delete the dead paths from the trellis.
4. If i < L + K, increase i by 1 and return to step 2.
C. Start at state zero at time t_{L+K}. Follow the surviving branches backwards through the trellis. The path thus defined is unique and corresponds to the ML codeword.
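The steps can be sketched for the example rate-1/2, K = 3 code (g1 = 111, g2 = 101) with hard-decision (Hamming-distance) branch metrics; states are the register contents (most recent bit first), and the zero tail forces termination in state 00:

```python
def viterbi_decode(z, g1=(1, 1, 1), g2=(1, 0, 1)):
    """Hard-decision Viterbi decoding for the rate-1/2, K=3 example code
    (zero-tail termination); z is the received bit sequence."""
    n_steps = len(z) // 2
    states = [(a, b) for a in (0, 1) for b in (0, 1)]
    metric = {s: float("inf") for s in states}   # path metric per state
    metric[(0, 0)] = 0                           # trellis starts in state 00
    path = {s: [] for s in states}               # survivor input bits per state
    for t in range(n_steps):
        z1, z2 = z[2 * t], z[2 * t + 1]
        new_metric = {s: float("inf") for s in states}
        new_path = {}
        for s in states:
            if metric[s] == float("inf"):
                continue                         # state not yet reachable
            for b in (0, 1):
                reg = (b,) + s                   # full register contents
                u1 = sum(r * g for r, g in zip(reg, g1)) % 2
                u2 = sum(r * g for r, g in zip(reg, g2)) % 2
                nxt = reg[:2]
                m = metric[s] + (u1 != z1) + (u2 != z2)  # Hamming branch metric
                if m < new_metric[nxt]:          # keep only the survivor
                    new_metric[nxt] = m
                    new_path[nxt] = path[s] + [b]
        metric, path = new_metric, new_path
    decoded = path[(0, 0)]                       # terminated trellis ends in 00
    return decoded[:-2]                          # strip the two tail bits

print(viterbi_decode([1, 1, 1, 0, 0, 0, 1, 0, 1, 1]))  # [1, 0, 1]
```

With a clean received sequence this recovers m = (101); a single channel error is also corrected, as in the slides' example.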
Example of Viterbi decoding
m = (101), U = (11 10 00 10 11), Z = (11 10 11 10 01)
[Trellis for the example code over t1..t6 with branches labeled input/output]
Viterbi decoding-contd
Label all the branches with the branch metric (Hamming distance between the branch output and the received branch word).
m = (101), U = (11 10 00 10 11), Z = (11 10 11 10 01)
[Trellis figures for i = 2, 3, 4, 5, 6: at each time, the partial path metrics Gamma(S(ti), ti) are computed for every state and only the survivor per state is kept]
Trace back from state zero at t6:
m^ = (100), U^ = (11 10 11 00 00)
U^ differs from Z in 2 positions versus 3 for the transmitted U, so the ML decoder picks the wrong path: the three channel errors exceed the code's correction capability.
Soft and hard decisions
Hard decision:
The demodulator makes a firm or hard decision whether a one or a zero was transmitted and provides no other information regarding how reliable the decision is.
Hence, its output is only zero or one (the output is quantized to only two levels), called hard-bits.
Soft decision:
The demodulator provides the decoder with some side information together with the decision.
The side information provides the decoder with a measure of confidence in the decision.
The demodulator outputs, called soft-bits, are quantized to more than two levels (e.g. 8 levels).
Decoding based on soft-bits is called soft-decision decoding.
On AWGN channels about 2 dB, and on fading channels about 6 dB, of gain is obtained by using soft-decision over hard-decision decoding!
Performance bounds
Basic coding gain (dB) for soft-decision Viterbi decoding:
PB          Uncoded Eb/N0 (dB)   Rate 1/3     Rate 1/2
10^-3            6.8             4.2  4.4     3.5  3.8
10^-5            9.6             5.7  5.9     4.6  5.1
10^-7           11.3             6.2  6.5     5.3  5.8
Upper bound                      7.0  7.3     6.0  7.0
Interleaving
Convolutional codes are suitable for memoryless channels with random error events.
Some errors have a bursty nature:
Statistical dependence among successive error events (time-correlation) due to the channel memory,
e.g. errors in multipath fading channels in wireless communications, or errors due to switching noise.
Interleaving makes the channel look memoryless at the decoder.
Interleaving
Consider a code with t = 1 and 3 coded bits per codeword.
A burst error of length 3 cannot be corrected:
A1 A2 A3 B1 B2 B3 C1 C2 C3   (a burst of 3 puts 2 errors in one codeword)
Let us use a 3x3 block interleaver:
A1 A2 A3 B1 B2 B3 C1 C2 C3 -> Interleaver -> A1 B1 C1 A2 B2 C2 A3 B3 C3
After a burst of 3 on the channel, the deinterleaver restores the order:
A1 B1 C1 A2 B2 C2 A3 B3 C3 -> Deinterleaver -> A1 A2 A3 B1 B2 B3 C1 C2 C3
Now each codeword (A, B, C) sees only 1 error, which is correctable.
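The 3x3 block interleaver is just a row-write/column-read permutation. A minimal sketch:

```python
def interleave(seq, rows, cols):
    """Block interleaver: write row-by-row, read column-by-column."""
    assert len(seq) == rows * cols
    return [seq[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(seq, rows, cols):
    """Inverse permutation: write column-by-column, read row-by-row."""
    return interleave(seq, cols, rows)

tx = ['A1', 'A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3']
sent = interleave(tx, 3, 3)
print(sent)  # ['A1', 'B1', 'C1', 'A2', 'B2', 'C2', 'A3', 'B3', 'C3']
# Any 3 consecutive sent symbols belong to 3 different codewords,
# so a burst of 3 becomes one error per codeword after deinterleaving.
print(deinterleave(sent, 3, 3) == tx)  # True
```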
Concatenated Codes
A concatenated code uses two levels of coding, an inner code and an outer code (higher rate).
Popular concatenated codes: convolutional codes with Viterbi decoding as the inner code and Reed-Solomon codes as the outer code.
The purpose is to reduce the overall complexity, yet achieve the required error performance.
[Block diagram: input data -> outer encoder -> interleaver -> inner encoder -> modulate -> channel -> demodulate -> inner decoder -> deinterleaver -> outer decoder -> output data]
Concatenated Codes
An encoder-channel-decoder system C -> Q -> D can be viewed as
defining a superchannel Q' with a smaller probability of error, but with
complex correlations among its errors.
We can then create an outer encoder C' and decoder D' for this
superchannel Q'.
Product/Rectangular Codes: Concatenation +
Interleaving
Some concatenated codes make use of the idea of interleaving.
Blocks are of size larger than the block lengths of the constituent
codes C' and C''.
After encoding the data of one block using code C',
the bits are reordered within the block in such a way that nearby bits
are separated from each other once the block is fed to the second code C''.
A simple example of an interleaver is a rectangular code or
product code, in which
the data form a K2 x K1 rectangular block,
encoded horizontally using an (N1, K1) linear code,
then vertically using an (N2, K2) linear code.
Product code Example
(a) A string 1011 encoded using a concatenated code
built from two Hamming codes: H(3,1) (the repetition code
R3) and H(7,4).
(b) A noise pattern that flips 5 bits.
(c) The received vector.
Product Codes (Contd)
(d) After decoding using the horizontal (3,1)
decoder, and
(e) after subsequently using the vertical (7,4)
decoder.
The decoded vector matches the original.
Note: Decoding in the other order (weaker code first) leaves a
residual error in this example.
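The example above can be reproduced with a short sketch (an illustrative implementation; the 5-bit error pattern is assumed, not MacKay's exact pattern): the data are Hamming(7,4)-encoded vertically, each coded bit is repeated 3 times horizontally (R3), the damaged block is decoded with majority votes first, and the Hamming decoder then cleans up the one row whose majority vote failed.

```python
def hamming74_encode(d):
    """Hamming(7,4) with codeword layout [p1 p2 d1 p3 d2 d3 d4]."""
    d1, d2, d3, d4 = d
    return [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d1, d2 ^ d3 ^ d4, d2, d3, d4]

def hamming74_decode(c):
    """Syndrome decoding: the syndrome spells the 1-indexed error position."""
    c = c[:]
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3
    if pos:                        # nonzero syndrome: flip the indicated bit
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
block = [[bit] * 3 for bit in hamming74_encode(data)]  # 7 rows x 3 copies

# Flip 5 bits: two in row 0 (defeats the majority vote there), one each
# in rows 2, 4, 6 (majority vote still wins in those rows).
for r, col in [(0, 0), (0, 1), (2, 0), (4, 2), (6, 1)]:
    block[r][col] ^= 1

majority = [1 if sum(row) >= 2 else 0 for row in block]  # horizontal R3 decode
recovered = hamming74_decode(majority)                   # vertical Hamming decode
print(recovered)  # [1, 0, 1, 1] -- matches the original data
```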
Practical example: Compact disc
Without error-correcting codes, digital audio
would not be technically feasible.
The channel in a CD playback system consists of a transmitting laser, a recorded
disc, and a photo-detector.
Sources of errors are manufacturing defects, fingerprints, or scratches;
the errors have a bursty nature.
Error correction and concealment is done using a concatenated error-control
scheme called the Cross-Interleaved Reed-Solomon Code (CIRC).
Both the inner and outer codes are shortened RS codes.
Compact disc CIRC Encoder
CIRC encoder and decoder:
Encoder:  interleave -> C2 encode -> interleave (D*) -> C1 encode -> interleave (D)
Decoder (the mirror image):
deinterleave (D) -> C1 decode -> deinterleave (D*) -> C2 decode -> deinterleave
Adaptive Modulation and Coding
Adaptive Modulation
Vary M in the M-QAM constellation to match the
current SNR.
Can be used in conjunction with spatial diversity.
Adaptive modulation/coding: Multi-User
Exploit multi-user diversity.
Users with high SNR: use M-QAM (large M) +
high code rates.
Users with low SNR: use BPSK + low code
rates (i.e., heavy error protection).
In any WiMAX frame, different users (assigned to
time-frequency slots within a frame) would be
getting different rates,
i.e., using different code/modulation combinations.
Basis for Adaptive Modulation/Coding (AMC)
K-user system: the subcarrier of
interest experiences i.i.d.
Rayleigh fading: each user's
channel gain is independent of
the others, and is denoted by h_k.
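The multi-user diversity gain behind AMC can be shown with a small simulation (illustrative only; unit-mean Rayleigh fading and the user counts are assumptions): scheduling the instantaneously best of K independent Rayleigh users yields an average channel power that grows with K.

```python
import random

random.seed(1)

def rayleigh_power():
    """|h|^2 for unit-mean-power Rayleigh fading is exponentially distributed."""
    return random.expovariate(1.0)

def avg_best_of_k(k, trials=20000):
    """Average of max_k |h_k|^2: the channel seen by a max-SNR scheduler."""
    return sum(max(rayleigh_power() for _ in range(k))
               for _ in range(trials)) / trials

single = avg_best_of_k(1)   # ~1.0: no selection gain
best4 = avg_best_of_k(4)    # ~1 + 1/2 + 1/3 + 1/4: multiuser diversity gain
print(single, best4)
```

The best-of-K average approaches the K-th harmonic number, so the scheduled user can sustain a larger constellation and higher code rate than a randomly chosen user.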
WiMAX: Uses Feedback & Burst Profiles
Lower data rates are achieved by using a small constellation, such as QPSK, and
low-rate error-correcting codes, such as rate-1/2 convolutional or turbo codes.
Higher data rates are achieved with large constellations, such as 64-QAM, and
less robust error-correcting codes, for example rate-3/4 convolutional, turbo, or
LDPC codes.
WiMAX burst profiles: 52 different possible configurations of modulation order and
coding types and rates.
WiMAX systems heavily protect the feedback channel with error correction, so
usually the main source of degradation is mobility, which causes channel
estimates to become obsolete rapidly.
AMC Considerations
BLER and Received SINR: In adaptive modulation theory, the transmitter
needs to know only the statistics and the instantaneous channel SINR. From the
channel SINR, it can determine the optimum coding/modulation strategy
and transmit power.
In practice, however, the BLER should be carefully monitored as the
final word on whether the data rate should be increased (if the BLER is
low) or decreased to a more robust setting.
Automatic Repeat Request (ARQ): ARQ allows rapid retransmissions,
and Hybrid ARQ generally raises the ideal BLER operating point by
about a factor of 10, e.g., from 1% to 10%.
For delay-tolerant applications, it may be possible to accept a BLER
approaching even 70%, if Chase combining is used in conjunction with
HARQ to make use of unsuccessful packets.
Power control vs. Waterfilling: In theory, the best power-control policy
from a capacity standpoint is the so-called waterfilling strategy, in which
more power is allocated to strong channels and less power to
weak channels. In practice, the opposite may be true in some cases.
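The waterfilling policy mentioned above can be sketched as follows (an illustrative implementation; the channel inverse-gain values and power budget are assumed): each channel receives power p_i = max(0, mu - N_i/g_i), with the water level mu found by bisection so that the powers sum to the budget.

```python
def waterfill(inverse_gains, total_power, iters=100):
    """Allocate total_power across channels with noise-to-gain ratios N_i/g_i."""
    lo, hi = 0.0, max(inverse_gains) + total_power
    for _ in range(iters):                     # bisect on the water level mu
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - ig) for ig in inverse_gains)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    return [max(0.0, mu - ig) for ig in inverse_gains]

# Three channels, strongest first (smallest N/g), and a budget of 3 power units.
powers = waterfill([0.5, 1.0, 2.0], 3.0)
print(powers)  # the strongest channel receives the most power
```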
AMC vs Shannon Limit
Optionally, turbo codes or LDPC codes can be used instead of simple
block/convolutional codes in these schemes.
Main Points
Adaptive M-QAM uses capacity-achieving power and rate
adaptation, with power penalty K.
Adaptive M-QAM comes within 5-6 dB of capacity.
Discretizing the constellation size results in negligible
performance loss.
Constellations cannot be updated faster than every 10s to 100s of
symbol times: OK for most Doppler rates.
Estimation error and delay lead to irreducible error floors.
Hybrid ARQ/FEC
Type I HARQ: Chase Combining
In Type I HARQ, also referred to as Chase combining, the redundancy
version of the encoded bits is not changed from one transmission to the
next, i.e., the puncturing pattern remains the same.
The receiver uses the current and all previous HARQ transmissions of the
data block in order to decode it.
With each new transmission, the reliability of the encoded bits improves, thus
reducing the probability of error during the decoding stage.
This process continues until either the block is decoded without error
(passes the CRC check) or the maximum number of allowable HARQ
transmissions is reached.
When the data block cannot be decoded without error and the maximum
number of HARQ transmissions is reached, retransmission of the data block
is left up to a higher layer such as the MAC or TCP/IP.
In that case, all previous transmissions are cleared and the HARQ process
starts from the beginning.
Used in WiMAX implementations: can provide range extension (especially
at the cell edge).
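Chase combining's reliability gain shows up in a toy simulation (illustrative assumptions: uncoded BPSK over AWGN, LLRs summed across 3 identical transmissions): combining scales the effective SNR by the number of transmissions.

```python
import random

random.seed(2)

N_BITS, SIGMA, N_TX = 4000, 1.0, 3

single_errors = combined_errors = 0
for _ in range(N_BITS):
    bit = random.randint(0, 1)
    tx = 1.0 if bit else -1.0
    # Three HARQ transmissions of the SAME coded bits (Chase combining).
    rx = [tx + random.gauss(0, SIGMA) for _ in range(N_TX)]
    llr = [2 * y / SIGMA**2 for y in rx]      # per-transmission bit LLRs
    single_errors += ((1 if llr[0] > 0 else 0) != bit)       # first try only
    combined_errors += ((1 if sum(llr) > 0 else 0) != bit)   # summed LLRs
print(single_errors, combined_errors)
```

Summing LLRs across retransmissions is exactly the "reliability of the encoded bits improves with each transmission" step described above.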
Type II HARQ: Incremental Redundancy
Type II HARQ is also referred to as incremental redundancy.
The redundancy version of the encoded bits is changed from one transmission to the
next; rate-compatible punctured convolutional (RCPC) codes are used.
Thus the puncturing pattern changes from one transmission to the next.
This not only improves the log-likelihood ratio (LLR) estimates of the parity bits but also
reduces the code rate with each additional transmission.
Incremental redundancy leads to lower bit error rates (BER) and block error rates
(BLER) compared to Chase combining.
WiMAX uses only Type I HARQ (Chase) and not Type II, for complexity reasons.
Hybrid ARQ/FEC: Combining Coding w/ Feedback
Data-path building blocks: packets, sequence numbers, CRC or checksum,
proactive FEC (sent along with the original data), retransmissions, and
reactive FEC (sent in response to feedback).
Feedback mechanisms: timeouts and status reports (ACKs, NAKs, SACKs,
bitmaps).
Hybrid ARQ/FEC For TCP over Lossy Networks
Proactive FEC (PFEC): the amount of upfront redundancy is a function of the
estimated loss process.
Reactive FEC (RFEC): additional repair packets computed on demand from the
loss rate and the data outstanding.
Loss-Tolerant TCP (LT-TCP) vs TCP-SACK
[Plot: LT-TCP approaches the maximum goodput, while TCP-SACK leaves much
of the goodput missing under loss.]
Tradeoffs in Hybrid ARQ/FEC
Analysis (10 Mbps link, loss rate p = 50%):
Goodput = 3.61 Mbps vs 5 Mbps (max).
PFEC waste: 1.0 Mbps = 10%.
RFEC waste: 0.39 Mbps = 3.9%.
Residual loss: 0.0%.
About 1.4 Mbps of goodput is sacrificed
(FEC waste) to reduce
latency and residual loss.
PFEC is sized from estimates of the loss process;
the upfront PFEC waste (10%) dominates the RFEC waste.
Residual loss can be negligible
even for high loss rates (50%), even
with a limit of just 1 ARQ attempt.
Weighted average number of rounds: 1.13.
Tradeoffs
Three-way tradeoff: goodput vs. residual loss rate vs. block recovery latency.
Towards the Shannon Limit!
LDPC, Turbo Codes, Digital Fountains
Recall: Coding Gain Potential
Gap from Shannon limit @ BER = 10^-5:
9.6 + 1.59 = 11.2 dB
(about 7.8 dB if you maintain
spectral efficiency).
With a convolutional code alone, @ a BER of 10^-5, we require an Eb/No of 4.5 dB, i.e., a
gain of 5.1 dB.
With a concatenated RS-convolutional code, the BER curve drops like a near-vertical cliff at an Eb/No
of about 2.5-2.6 dB, i.e., a gain of 7.1 dB.
We are still 11.2 - 7.1 = 4.1 dB away from the Shannon limit.
Turbo codes and LDPC codes get us within 0.1 dB of the Shannon limit!!
Low-Density Parity Check (LDPC) Codes
LDPC
Example LDPC Code
A low-density parity-check matrix and the corresponding (bipartite) graph
of a rate-1/4 low-density parity-check code with blocklength N = 16 and M
= 12 constraints.
Each white circle represents a transmitted bit.
Each bit participates in j = 3 constraints, represented by squares.
Each constraint forces the sum of the k = 4 bits to which it is connected to
be even.
This code is a (16, 4) code. Outstanding performance is obtained when the
blocklength is increased to N ≈ 10,000.
Tanner Graph
A.k.a. Factor Graph Notation
Factor Graphs
A factor graph shows how a function of several variables can be factored into a
product of "smaller" functions.
For example, the function g defined by g(x,y) = xy + x can be factored into
g(x,y) = f1(x)f2(y), where f1(x) = x and f2(y) = y + 1.
The factor graph depicts this factorization.
Graph for the function g(x,y,z) = f1(x,y) f2(y,z) f3(x,z).
Why Factor graphs?
1. Very general: variables and functions are arbitrary.
2. Factorization => the Sum-Product Algorithm can be applied.
3. Many efficient algorithms are special cases of the Sum-Product Algorithm
applied to factor graphs:
FFT (Fast Fourier Transform), Viterbi Algorithm, Forward-Backward
Algorithm, Kalman Filter, and Bayesian Network Belief Propagation.
Brings many good algorithms together in a common framework.
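The computational payoff of factorization can be shown with a tiny example (illustrative; the factor tables are made up): for a chain factorization g(x,y,z) = f1(x,y) f2(y,z), the marginal of x can be computed by pushing sums inside the product, exactly the trick the sum-product algorithm exploits.

```python
# Binary variables x, y, z; factors given as dicts (assumed toy values).
f1 = {(x, y): 1.0 + x + 2 * y for x in (0, 1) for y in (0, 1)}
f2 = {(y, z): 1.0 + 3 * y * z for y in (0, 1) for z in (0, 1)}

def marginal_brute(x):
    """Sum the full product over y and z: cost grows with every variable."""
    return sum(f1[x, y] * f2[y, z] for y in (0, 1) for z in (0, 1))

def marginal_factored(x):
    """Push the z-sum inside: sum_y f1(x,y) * (sum_z f2(y,z))."""
    msg = {y: sum(f2[y, z] for z in (0, 1)) for y in (0, 1)}  # message from f2
    return sum(f1[x, y] * msg[y] for y in (0, 1))

print([marginal_brute(x) for x in (0, 1)],
      [marginal_factored(x) for x in (0, 1)])  # identical marginals
```

On a long chain the factored form turns an exponential sum into a linear sequence of local "messages", which is why belief propagation on factor graphs is tractable.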
LDPC Coding Constructions
LDPC Decoding: Iterative
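The iterative decoding idea can be sketched with Gallager's hard-decision bit-flipping algorithm (an illustrative sketch; for brevity a small Hamming(7,4) parity-check matrix stands in for a genuinely sparse LDPC matrix): repeatedly flip the bit involved in the most unsatisfied checks until the syndrome is zero.

```python
# Parity-check matrix (rows = checks, columns = bits); Hamming(7,4) used as
# a small stand-in for a sparse LDPC matrix.
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]

def bit_flip_decode(H, received, max_iters=20):
    """Gallager-style bit flipping: flip the bit in the most failed checks."""
    r = received[:]
    n = len(r)
    for _ in range(max_iters):
        syndrome = [sum(row[j] & r[j] for j in range(n)) % 2 for row in H]
        if not any(syndrome):
            return r                       # all checks satisfied: done
        # Count, per bit, how many unsatisfied checks it participates in.
        counts = [sum(s for row, s in zip(H, syndrome) if row[j])
                  for j in range(n)]
        r[counts.index(max(counts))] ^= 1  # flip the worst offender
    return r

codeword = [0, 1, 1, 0, 0, 1, 1]           # a valid codeword of this code
noisy = codeword[:]
noisy[4] ^= 1                               # single bit error
print(bit_flip_decode(H, noisy))
```

Real LDPC decoders replace these hard counts with soft belief-propagation messages on the same Tanner graph, but the iterate-until-all-checks-pass structure is identical.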
Regular vs Irregular LDPC Codes
Irregular LDPC Codes
Turbo Codes
Turbo Encoder
The encoder of a turbo code.
Each box (C1, C2) contains a convolutional code.
The source bits are reordered by a permutation before they are fed to
C2.
The transmitted codeword is obtained by concatenating or interleaving the
outputs of the two convolutional codes.
The random permutation is chosen when the code is designed, and fixed
thereafter.
Turbo: MAP Decoding
Turbo Codes: Performance
UMTS Turbo Encoder
WiMAX: Convolutional Turbo Codes (CTC)
Digital Fountain Erasure Codes
What is a Digital Fountain?
A digital fountain is an ideal/paradigm for data
transmission.
Vs. the standard (TCP) paradigm: data is an
ordered, finite sequence of bytes.
Instead, with a digital fountain, a k-symbol file yields
an infinite data stream (fountain); once you have
received any k symbols from this stream, you can
quickly reconstruct the original file.
How Do We Build a Digital Fountain?
We can construct (approximate) digital fountains using erasure
codes,
including Reed-Solomon, Tornado, LT, and fountain codes.
Generally, we only come close to the ideal of the paradigm:
streams are not truly infinite; there are encoding and decoding times
and coding overhead.
Forward Error Correction (FEC):
Eg: Reed-Solomon RS(N,K)
K data packets are encoded into a block of size N (K data + N-K FEC
packets) and sent over the lossy network.
Receiving any >= K of the N packets suffices to recover the K data packets!
High encode/decode times: O{K(N-K) log2 N}.
Hard to do @ very fast line rates (eg: 1 Gbps+).
Digital Fountain Codes (Eg: Raptor codes)
Rateless: no block size! A fountain of encoded packets,
computed on demand.
Receiving any >= K(1+ε) packets recovers the K data packets
with probability 1-δ. Overhead ~ 5%.
Low encode/decode times: O{K ln(K/δ)}.
Can be done in software & at very fast line rates (eg: 1 Gbps+).
Raptor/Rateless Codes
Properties: approximately MDS.
Infinite supply of packets possible.
Need k(1+ε) symbols to decode, for some ε > 0.
Decoding time proportional to k ln(1/ε).
On average, ln(1/ε) (constant) time to produce an encoding
symbol.
Key: very fast encode/decode time compared to RS codes;
compute new check packets on demand!
Bottom line: these codes can be made very efficient and deliver
on the promise of the digital fountain paradigm.
Digital Fountain Encoder/Decoder
Digital Fountain decoding (example)
Received bits: 1011.
t1 is of degree 1, so s1 = t1 = 1, and t3 is XORed with s1.
Remove s1's edges.
t4 is now of degree 1, so s2 is set to t4 = 0.
Repeat as before; s3 = 1.
The first such code was
called the Tornado code.
Later: LT codes;
the concatenated version is the
Raptor code.
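The peeling process above can be sketched as follows (an illustrative decoder; the particular neighbor sets and received values are made up to mirror the 4-symbol example): repeatedly find a degree-1 encoded packet, recover its source symbol, and XOR that symbol out of every packet that references it.

```python
def lt_decode(k, packets):
    """Peeling decoder. packets: list of (neighbor_set, xor_of_those_symbols)."""
    packets = [[set(s), v] for s, v in packets]
    known = {}                                   # recovered source symbols
    changed = True
    while changed and len(known) < k:
        changed = False
        for p in packets:
            s = p[0]
            for i in [i for i in s if i in known]:
                s.discard(i)                     # peel off already-known symbols
                p[1] ^= known[i]
            if len(s) == 1:                      # degree-1 packet: new symbol
                i = s.pop()
                if i not in known:
                    known[i] = p[1]
                    changed = True
    return [known.get(i) for i in range(k)]

# Source symbols 1,0,1,1; four encoded packets (neighbor set, XOR value).
received = [({0}, 1), ({0, 1}, 1), ({1, 2}, 1), ({2, 3}, 0)]
print(lt_decode(4, received))  # [1, 0, 1, 1]
```

Decoding stalls when no degree-1 packet remains, which is why LT codes draw packet degrees from the robust soliton distribution discussed next.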
Esoterics: Robust Soliton Degree Distribution
Applications: Reliable Multicast
Many potential problems when multicasting to a large audience:
feedback explosion of lost packets,
start-time heterogeneity,
loss/bandwidth heterogeneity.
A digital fountain solves these problems.
Each user gets what it can, and stops when it has enough: it doesn't
matter which packets were lost,
and different paths can have different loss rates.
Applications: Downloading in Parallel
Can collect data from multiple digital fountains for the same
source seamlessly.
Since each fountain has an effectively infinite collection of packets, there are
no duplicates.
Relative fountain speeds are unimportant; you just need to get enough.
Combined multicast/multi-gather is possible.
Can be used for BitTorrent-like applications.
Microsoft's Avalanche product uses randomized linear codes to
do network coding.
[Link]
Used to deliver patches to security flaws rapidly, for Microsoft Update
dissemination, etc.
Single path: limited capacity, delay, loss
Network paths usually have:
low end-to-end capacity,
high latencies, and
high/variable loss rates.
Idea: Aggregate Capacity, Use Route Diversity!
Spreading over multiple paths yields high perceived capacity, low perceived
loss, low perceived delay/jitter, and a scalable performance boost.
Multi-path LT-TCP (ML-TCP): Structure
Socket buffer: map packets to paths intelligently,
based upon Rank(p_i, RTT_i, w_i).
Reliability at the aggregate level, across paths
(FEC block = weighted sum of windows;
PFEC based upon the weighted-average loss rate).
Per-path congestion control
(like TCP).
Note: these ideas can be applied to link-level multi-homing,
network-level virtual paths, and non-TCP transport protocols (including video streaming).
Summary
Coding allows better use of the degrees of freedom in a channel:
greater reliability (BER) for a given Eb/No, or
coding gain (power gain) for a given BER.
Eg: @ BER = 10^-5:
5.1 dB (convolutional), 7.1 dB (concatenated RS/convolutional).
Near (0.1-1 dB from) the Shannon limit: LDPC, turbo codes.
The magic is achieved through iterative decoding (belief propagation) in both
LDPC and turbo codes.
Concatenation and interleaving are used in turbo codes.
Digital fountain erasure codes use randomized LDPC constructions as
well.
Coding can be combined with modulation adaptively in response to SNR
feedback.
Coding can also be combined with ARQ to form Hybrid ARQ/FEC.
Efficient coding schemes are now possible in software at high line rates => they
are influencing protocol design at higher layers also:
LT-TCP, ML-TCP, multicast, storage (RAID, CD/DVDs), BitTorrent,
network coding in Avalanche (Microsoft Updates), etc.