Introduction to coding theory (CMU, Spr 2010)

April 30, 2010

Lecture 26 summary

Filed under: Lecture summary — Venkat Guruswami @ 5:17 pm

We mentioned two topics that were introduced to coding theory by theoretical computer science: local testing and local decoding of codes. These and related topics (such as PCPs and applications of locally decodable codes in complexity and cryptography) have been intensively researched in the last 10-15 years, with several breakthroughs in recent years.

We focused on local (unique) decoding of codes for the lecture. We saw how Hadamard codes can be locally decoded using just two queries. However, their encoding length for a message of length n is 2^n. We then saw the higher-degree generalization of Hadamard codes, where the message is interpreted as a degree-D homogeneous multilinear polynomial (i.e., all terms have degree exactly D \ge 2). This gave us codes of encoding length \approx 2^{O(n^{1/D})}, and we discussed a 2D-query local decoding algorithm, based on interpolating the restriction of the multilinear polynomial to a line in a random direction. Thus for any constant q, we got codes that are locally decodable using q queries and have encoding length 2^{n^{O(1/q)}}.
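
To make the two-query decoder concrete, here is a minimal sketch (the oracle interface and the majority-vote repetition are illustrative choices, not from the lecture): to recover the message bit a_i of the Hadamard encoding x \mapsto \langle a, x \rangle, query a random x and x + e_i and XOR the answers; when the received word is \delta-close to a codeword, each query is wrong with probability at most \delta, so a single trial succeeds with probability at least 1 - 2\delta.

    import random

    def hadamard_local_decode(f, n, i, trials=9):
        # f: oracle for the (possibly corrupted) received word; the true
        # codeword value at x in {0,1}^n is <a, x> mod 2.
        # Each trial makes 2 queries: f(x) XOR f(x + e_i) equals a_i
        # whenever both answers are uncorrupted; we take a majority vote.
        votes = 0
        for _ in range(trials):
            x = [random.randrange(2) for _ in range(n)]
            y = x[:]
            y[i] ^= 1  # y = x + e_i over F_2
            votes += f(tuple(x)) ^ f(tuple(y))
        return int(2 * votes > trials)

    # Example with an uncorrupted oracle for a = (1, 0, 1):
    a = (1, 0, 1)
    f = lambda x: sum(ai * xi for ai, xi in zip(a, x)) % 2
    print([hadamard_local_decode(f, 3, i) for i in range(3)])  # [1, 0, 1]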

We then turned to the ingenious 3-query locally decodable code (LDC) construction due to Yekhanin. In keeping with the theme of our initial constructions, we presented a polynomial view of these codes, where the messages are again interpreted as homogeneous multilinear polynomials of a certain degree (say D), but only a carefully chosen subset of all {M \choose D} possible monomials is allowed. (This actually reduces the rate compared to our earlier construction, but the big gain is that one is able to locally decode using only three queries instead of about D queries!) Our description is based on a variant of Yekhanin’s construction that was discovered by Raghavendra and subsequently presented by Gopalan as polynomial-based codes.

For every t such that 2^t-1 is prime (such a prime is called a Mersenne prime), we gave a construction of 3-query LDCs of encoding length \exp(O_t(n^{1/t})). Since very large Mersenne primes are known, we get 3-query LDCs of encoding length less than \exp(O(n^{10^{-7}})). We presented a 3-query algorithm and proved its correctness assuming the stated properties of the “matching sets” U_i,V_i used in the construction, and then explained how to construct families of such subsets of \{1,2,\dots,M\} of size \Omega_t(M^t).

April 28, 2010

Lecture 25 summary

Filed under: Lecture summary — Venkat Guruswami @ 8:51 pm

We discussed irregular LDPC codes, and characterized their rate and erasure correction capability (via the message passing algorithm discussed in the previous lecture) in terms of the degree distribution of the edges. Specifically, let \lambda_i (resp. \rho_i) be the fraction of edges incident on degree-i variable (resp. check) nodes, and define the generating functions \lambda(z) = \sum_{i=1}^{d_v^{\max}} \lambda_i z^{i-1} and \rho(z) = \sum_{i=1}^{d_c^{\max}} \rho_i z^{i-1}. Then the rate of the LDPC code is given by

\displaystyle 1 -\frac{\int_0^1 \rho(z) \ dz}{\int_0^1 \lambda(z) \ dz} \ .

We also argued that if \alpha \lambda(1-\rho(1-x)) \le x for every x, 0 \le x \le 1, and some constant \alpha > 0, then the message passing algorithm succeeds with high probability on \mathrm{BEC}_{\alpha'} for any constant \alpha' < \alpha.
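
As a numerical sanity check of these two facts (an illustrative snippet; the grid test of the fixed-point condition is a simplification, not the lecture's argument), one can evaluate the rate formula and the condition for, say, the (3,6)-regular ensemble, where \lambda(z) = z^2 and \rho(z) = z^5:

    def poly_eval(coeffs, z):
        # coeffs[i] is the coefficient of z^i, so coeffs[i] corresponds
        # to the edge fraction lambda_{i+1} in lambda(z) = sum_i lambda_i z^{i-1}.
        return sum(c * z**i for i, c in enumerate(coeffs))

    def ldpc_rate(lam, rho):
        # rate = 1 - int_0^1 rho / int_0^1 lambda, using
        # int_0^1 c_i z^i dz = c_i / (i + 1).
        int_lam = sum(c / (i + 1) for i, c in enumerate(lam))
        int_rho = sum(c / (i + 1) for i, c in enumerate(rho))
        return 1 - int_rho / int_lam

    def erasure_condition_holds(lam, rho, alpha, grid=10000):
        # Checks alpha * lambda(1 - rho(1 - x)) <= x on a grid in (0, 1].
        for k in range(1, grid + 1):
            x = k / grid
            if alpha * poly_eval(lam, 1 - poly_eval(rho, 1 - x)) > x:
                return False
        return True

    # (3,6)-regular: lambda(z) = z^2, rho(z) = z^5.
    lam, rho = [0, 0, 1], [0, 0, 0, 0, 0, 1]
    print(ldpc_rate(lam, rho))                      # 0.5
    print(erasure_condition_holds(lam, rho, 0.42))  # True: below the ~0.4294 threshold
    print(erasure_condition_holds(lam, rho, 0.44))  # False: above it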

We then argued how the distributions

\displaystyle \lambda(z) = \frac{1}{H(D-1)} \sum_{i=1}^{D-1} \frac{z^i}{i}

and

\displaystyle \rho(z) = \exp \left( \frac{H(D-1)}{\alpha} (z-1) \right)

(perhaps truncated to a finite series) enable achieving the capacity of \mathrm{BEC}_{\alpha'} — we can achieve a rate of 1-\alpha'-\epsilon with decoding complexity O(n \log(1/\epsilon)) (since the average variable node degree is \approx H(D-1)).
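
One way to see why this pair satisfies the condition: comparing with the full series \sum_{i \ge 1} y^i/i = -\ln(1-y) gives \lambda(y) \le -\ln(1-y)/H(D-1), and 1 - \rho(1-x) = 1 - e^{-H(D-1)x/\alpha}, so \alpha \lambda(1-\rho(1-x)) \le x. The rate can also be checked numerically (an illustrative snippet; the closed forms for the integrals follow from the definitions above):

    import math

    def harmonic(n):
        return sum(1.0 / i for i in range(1, n + 1))

    def rate_heavy_tail_poisson(D, alpha):
        # lambda(z) = (1/H(D-1)) * sum_{i=1}^{D-1} z^i / i,
        # rho(z)    = exp((H(D-1)/alpha) * (z - 1)).
        H = harmonic(D - 1)
        int_lam = (1 - 1.0 / D) / H                       # telescoping sum
        int_rho = (alpha / H) * (1 - math.exp(-H / alpha))
        return 1 - int_rho / int_lam

    for D in (10, 100, 1000):
        print(D, rate_heavy_tail_poisson(D, alpha=0.4))   # approaches 1 - 0.4 = 0.6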

This result is from the paper Efficient erasure correcting codes. Further details, including extensions to BSC and AWGN channels, and the martingale argument for the concentration of the performance around that of the average code in the ensemble, can be found in the paper The capacity of low-density parity-check codes under message-passing decoding.

The last quarter of the lecture was devoted to a recap of the main topics covered in the course.

April 24, 2010

Lecture 24 summary

Filed under: Lecture summary — Venkat Guruswami @ 10:08 pm

We discussed the message passing algorithm for decoding LDPC codes based on (d_v,d_c)-regular graphs on the binary erasure channel, and derived an expression for the threshold erasure probability up to which the algorithm guarantees vanishing bit error probability. We then turned to the binary symmetric channel, and discussed Gallager’s “Algorithm A” and derived the recurrence equation for the decay of the bit error probability. We briefly discussed Gallager’s “Algorithm B” as well, where a variable node flips its value if more than a certain cut-off number (typically a majority, after a few iterations) of its neighboring check nodes suggest that it do so. We mentioned the values of the threshold crossover probability for some small values of d_v and d_c.
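
For concreteness, here is a small density-evolution sketch for both channels (illustrative code, not from the lecture; the Algorithm A recurrence below is the standard one, with q denoting the probability that an incoming check-to-variable message is wrong). On \mathrm{BEC}_\alpha the erased-message fraction evolves as x_{\ell+1} = \alpha (1-(1-x_\ell)^{d_c-1})^{d_v-1}.

    def bec_threshold(dv, dc, iters=5000, tol=1e-6):
        # Binary-search for the largest erasure probability alpha for which
        # x <- alpha * (1 - (1 - x)^(dc-1))^(dv-1) drives x to 0.
        lo, hi = 0.0, 1.0
        for _ in range(40):
            alpha = (lo + hi) / 2
            x = alpha
            for _ in range(iters):
                x = alpha * (1 - (1 - x) ** (dc - 1)) ** (dv - 1)
                if x < tol:
                    break
            if x < tol:
                lo = alpha
            else:
                hi = alpha
        return lo

    def gallager_a_ber(p0, dv, dc, iters):
        # Algorithm A on BSC(p0): q = prob. a check message is wrong;
        # a bit changes only when all dv-1 incoming check messages agree.
        p = p0
        for _ in range(iters):
            q = (1 - (1 - 2 * p) ** (dc - 1)) / 2
            p = p0 * (1 - (1 - q) ** (dv - 1)) + (1 - p0) * q ** (dv - 1)
        return p

    print(bec_threshold(3, 6))              # about 0.429 for the (3,6) ensemble
    print(gallager_a_ber(0.03, 3, 6, 200))  # tiny: 0.03 is below the threshold
    print(gallager_a_ber(0.05, 3, 6, 200))  # bounded away from 0: above it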

During lecture, the question of the speed of convergence of the bit error probability (BER) to zero was asked. The answer I guessed turns out to be correct: if we run the algorithm for a number of iterations that is \Omega(\log n) yet smaller than the girth of the graph, then for Algorithm A the BER is at most 1/n^{\beta} for some \beta > 0, and for Algorithm B with d_v > 3 and an optimized cut-off for flipping, the BER is at most 2^{-n^{\gamma}} for some \gamma > 0.

We do not plan to have notes for this segment of the course.  I can, however, point you to an introductory survey I wrote (upon which the lectures are loosely based), or Gallager’s remarkable Ph.D. thesis which can be downloaded here (the decoding algorithms we covered are discussed in Chapter 4). A thorough treatment of the latest developments in the subject of iterative and belief propagation decoding algorithms can be found in Richardson and Urbanke’s comprehensive book Modern Coding Theory.

April 21, 2010

Lecture 23 summary

Filed under: Lecture summary — Venkat Guruswami @ 3:29 pm

We completed the discussion of the rate vs. list decoding radius trade-off achieved by folded Reed-Solomon codes and multivariate interpolation based decoding, and discussed its complexity and list-size bounds, as well as alphabet size. We highlighted the powerful list recovery property offered by folded RS codes, where having up to \ell possible choices for each codeword position does not affect the ability to correct with agreement R + \epsilon (where R is the rate), and we can “absorb” the effect of \ell into a somewhat larger alphabet size and decoding complexity. This feature is invaluable in using folded RS codes as outer codes in concatenation schemes, as we saw in two results:

  1. Binary codes which are list-decodable up to the Zyablov radius (earlier we saw how to uniquely decode up to half the Zyablov radius using GMD decoding)
  2. Construction of codes of rate R over an alphabet of size \exp((1/\epsilon)^{O(1)}) that are list-decodable up to a fraction 1-R-\epsilon of errors. The alphabet size is not far from the optimal bound of \exp(1/\epsilon), and nicely combines ideas from the algebraic coding and expander decoding parts of the course.

We then wrapped up our discussion of list decoding by mentioning some of the big questions that still remain open, especially in constructing binary codes with near-optimal (or even better than currently known) trade-offs.

We discussed the framework of message-passing algorithms for LDPC codes, which will be the subject of the next lecture or two. We will mostly follow the description in this survey, but will not get too deep into the material.

April 14, 2010

Lecture 22 summary

Filed under: Lecture summary — Venkat Guruswami @ 2:59 pm

We discussed how folded Reed-Solomon codes can be used to approach the optimal trade-off between rate and list decoding radius, specifically list decoding in polynomial time from a fraction 1-R-\epsilon of errors with rate R for any desired constant \epsilon > 0.

We presented an algorithm for list decoding folded Reed-Solomon codes (with folding parameter s) when the agreement fraction is more than \frac{1}{s+1} + \frac{s^2 R}{s+1}.  This was based on the extension of the Welch-Berlekamp algorithm to higher order interpolation (in s+1 variables). Unfortunately, this result falls well short of our desired target, and in particular is meaningless for R > 1/s.

We then saw how to run the (s+1)-variate algorithm on a folded RS code with folding parameter m > s, to list decode when the agreement fraction is more than \frac{1}{s+1} + \frac{s}{s+1} \frac{m}{m-s+1} R. Picking s large and m \gg s, say s \approx 1/\epsilon and m \approx 1/\epsilon^2, then enables list decoding from agreement fraction R+\epsilon. We will revisit this final statement briefly at the beginning of the next lecture, and also comment on the complexity of the algorithm, bound on list-size, and alphabet size of the codes.
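
A quick numerical illustration of this last step (sample parameter choices for s and m; only the displayed agreement bound is from the summary above):

    def agreement_needed(s, m, R):
        # Agreement fraction sufficient for the (s+1)-variate decoder
        # run on an m-folded RS code of rate R.
        return 1 / (s + 1) + (s / (s + 1)) * (m / (m - s + 1)) * R

    R = 0.5
    for eps in (0.1, 0.05, 0.01):
        s = round(1 / eps)
        m = round(1 / eps ** 2)
        print(eps, agreement_needed(s, m, R))  # approaches R = 0.5 from above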

Notes for this lecture may not be immediately available, but you can refer to the original paper Explicit codes achieving list decoding capacity: Error-correction with optimal redundancy or Chapter 6 of the survey Algorithmic results for list decoding. Both of these are tailored to list decode even from the (in general) smaller agreement fraction \left(\frac{mR}{m-s+1}\right)^{s/(s+1)} and use higher degrees for the Z_i’s in the polynomial Q(X,Z_1,\ldots,Z_s) as well as multiple zeroes at the interpolation points. In the lecture, however, we were content, for the sake of simplicity and because it suffices to approach agreement fraction R + \epsilon, with restricting Q to be linear in the Z_i’s.

A reminder that we will have NO lecture this Friday (April 16) due to Spring Carnival.

April 9, 2010

Lecture 21 summary

Filed under: Lecture summary — Venkat Guruswami @ 2:54 pm

Today we completed the description and analysis of the multiplicities based weighted polynomial reconstruction algorithm which immediately yielded an algorithm for list decoding Reed-Solomon codes up to the Johnson radius of 1-\sqrt{R} for rate R. We discussed the utility of weights in exploiting “soft” information available during decoding (e.g., from decoding inner codes in a concatenation scheme, or from a demodulator which “rounds” analog signals to digital values). We saw simple consequences for list decoding binary concatenated codes, and in particular how to list-decode from a fraction (1/2-\gamma) of errors with rate \Omega(\gamma^6) and list size O(1/\gamma^3). While the rate is positive for every \gamma > 0, it is far from the optimal \gamma^2 bound. (We will soon see how this can be improved substantially by using codes with more powerful list-decoding properties than Reed-Solomon codes at the outer level.) Finally we defined (a version of) folded Reed-Solomon codes (we will give a list decoding algorithm for these next lecture).
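
As a quick comparison of this radius with unique decoding (an illustrative snippet; (1-R)/2 is the half-the-distance bound for rate-R MDS codes, and the gain of 1-\sqrt{R} over it is most dramatic at low rates):

    import math

    # Unique-decoding radius (1 - R)/2 vs. list-decoding (Johnson) radius
    # 1 - sqrt(R) for rate-R Reed-Solomon codes.
    for R in (0.05, 0.1, 0.25, 0.5, 0.9):
        print(R, round((1 - R) / 2, 4), round(1 - math.sqrt(R), 4))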

We will have notes for this week’s lecture available soon, but the material covered this week has also been written about in several surveys on list decoding (some of which are listed on the course webpage). Here are a couple of pointers, which also discuss the details of list decoding folded RS codes which we will cover next week (though we will use a somewhat simpler presentation with weaker bounds in our lectures):

April 7, 2010

Lecture 20 summary

Filed under: Lecture summary — Venkat Guruswami @ 6:46 pm

We touched upon some aspects of the combinatorics of list decoding Reed-Solomon codes. We discussed the geometric intuition underlying Reed-Solomon decoding, and presented Sudan’s algorithm for polynomial reconstruction/list decoding of RS codes. We analyzed it by optimizing individual degrees, leading to an algorithm working for agreement parameter t > 2 \sqrt{kn}, where k is the degree and n is the number of points. By optimizing the (1,k)-weighted degree, we improved this to t \ge \sqrt{2kn}.
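
For reference, the counting behind the weighted-degree bound is short. The number of monomials X^aY^b with (1,k)-weighted degree a + kb \le D is roughly D^2/(2k), so a nonzero interpolating polynomial Q vanishing at all n points (n homogeneous linear constraints) exists as soon as D^2/(2k) > n, i.e., D \approx \sqrt{2kn}. For any polynomial p of degree at most k agreeing with the received word in t positions, Q(X, p(X)) is a univariate polynomial of degree at most D with at least t roots, hence identically zero once t > D; then Y - p(X) divides Q and p can be read off from the factorization.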

We motivated the method of using multiple zeroes in the interpolation with an example, and then presented the details of how to enforce multiple (say w) zeroes at a point for a bivariate polynomial, why it buys us w zeroes for each point of agreement, and why this should improve the agreement t needed by a further \sqrt{2} factor, thus matching the Johnson bound.
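
In the same notation as above: a zero of multiplicity w at each of the n points imposes about n\binom{w+1}{2} = nw(w+1)/2 linear conditions, so interpolation succeeds once D^2/(2k) > nw(w+1)/2, i.e., D \approx \sqrt{knw(w+1)}. Each agreement point now contributes w roots (with multiplicity) of Q(X, p(X)), so the condition tw > D suffices, giving

\displaystyle t > \sqrt{kn \left(1 + \frac{1}{w}\right)} \ ,

which tends to \sqrt{kn} as w grows. In fractional terms this is agreement \sqrt{k/n} \approx \sqrt{R}, i.e., error fraction 1-\sqrt{R}, matching the Johnson bound.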

April 2, 2010

Lecture 19 summary

Filed under: Lecture summary — Venkat Guruswami @ 7:54 pm

We pinned down the “list decoding capacity” for q-ary codes as 1-h_q(p), and saw that for large q this implies codes of rate R list-decodable up to a fraction 1-R-\epsilon of errors. We recalled the Johnson bound and how it implies a way to argue about list decoding properties knowing only the distance of the code. For Reed-Solomon codes this suggests that list-decoding from a fraction 1-\sqrt{R} of errors should be possible. This trade-off between rate and list decoding radius is best possible if one only appeals to the Johnson bound.
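
To illustrate the large-q statement numerically (an illustrative snippet; recall h_q(p) = p\log_q(q-1) - p\log_q p - (1-p)\log_q(1-p)):

    import math

    def h_q(p, q):
        # q-ary entropy function; h_q(p) -> p as q grows.
        if p == 0:
            return 0.0
        return (p * math.log(q - 1, q)
                - p * math.log(p, q)
                - (1 - p) * math.log(1 - p, q))

    # List-decoding capacity: rate 1 - h_q(p) at error fraction p.
    p = 0.4
    for q in (2, 16, 256, 2**16):
        print(q, 1 - h_q(p, q))   # tends to 1 - p = 0.6 as q grows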

We then began the algorithmic segment of list decoding, setting the stage for list decoding Reed-Solomon codes. In particular, we gave an algorithm for a toy problem where, given a string which is the “mixture” of two codewords encoding polynomials p_1 and p_2, we recover both p_1 and p_2.
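
In outline (a sketch of one way to solve the toy problem; the specific shape of Q is an assumption of this sketch, not a transcript of the lecture): every point (\alpha_i, y_i) of the mixture satisfies (y_i - p_1(\alpha_i))(y_i - p_2(\alpha_i)) = 0, so the bivariate polynomial Q(X,Y) = Y^2 - (p_1(X) + p_2(X))Y + p_1(X)p_2(X) vanishes on all the data. One can therefore interpolate a polynomial of this form (finding the coefficients of the two unknown univariate polynomials by linear algebra) and recover p_1 and p_2 by factoring Q as a quadratic in Y.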

March 31, 2010

Lecture 18 summary

Filed under: Lecture summary — Venkat Guruswami @ 3:07 pm

We discussed a method for distance amplification using expander graphs, and how the resulting codes can be decoded via a majority logic algorithm. We discussed how to get near-MDS codes of rate R that enable linear-time correction of a fraction (1-R-\epsilon)/2 of errors.

We motivated list decoding as a primitive to bridge between the Shannon random errors model and the Hamming worst-case error model. We stated and proved the existence of list-decodable codes that achieve a rate vs error-correction radius trade-off similar to the Shannon capacity for the q-ary symmetric channel. We commented on why the proof does not work as such for linear codes.
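
For reference, the existence proof is a union bound, roughly as follows: pick q^{Rn} codewords independently and uniformly at random. A fixed codeword lands in a fixed Hamming ball of radius pn with probability at most q^{-(1-h_q(p))n}, so the probability that some L+1 codewords all land in the ball around some center is at most q^n \cdot q^{Rn(L+1)} \cdot q^{-(1-h_q(p))n(L+1)}, which is o(1) whenever R < 1 - h_q(p) - 1/(L+1). For linear codes the codewords are not mutually independent (only pairwise independent), so this union bound over (L+1)-tuples breaks down, which is (one form of) the obstruction for the linear case.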

March 26, 2010

Lecture 17 summary

Filed under: Lecture summary — Venkat Guruswami @ 9:58 pm

We proved the distance property of Tanner codes based on (spectral) expanders using a simple application of the expander mixing lemma. We discussed a decoding algorithm for these codes (to correct a number of errors up to about 1/4 of the bound on minimum distance), based on O(\log n) iterations of decoding the local codes. We saw a distance amplification technique using dispersers, which yields codes of relative distance 1-\epsilon over an alphabet of size \exp(O(1/\epsilon)).

I’d like to make two clarifications about the lecture. The first one concerns the calculation in the analysis of the decoding algorithm where we argued that the set T_1 was a constant factor smaller than S_1. If we are content with ensuring that |T_1| \le \frac{|S_1|}{1+\epsilon}, then it suffices to take the degree d of the expander to be at least 3\lambda/\delta_0 (as I had originally intended; in particular, the degree need not grow as 1/\epsilon).

Indeed, by the expander mixing lemma and using that |S_1| \le (1-\epsilon) \left( \frac{\delta_0}{2} - \frac{\lambda}{d} \right) n, we have, in the notation from the lecture,

\displaystyle \frac{\delta_0 d}{2} |T_1| \le (1-\epsilon) \left( \frac{\delta_0 d}{2} - \lambda \right) |T_1| + \lambda \frac{|S_1|+|T_1|}{2} \ ,

which upon rearranging yields

\displaystyle |T_1| \le \frac{\lambda}{\epsilon \delta_0 d + (1-2\epsilon) \lambda} |S_1| \le \frac{|S_1|}{1+\epsilon} \ .

(The last step follows if d \ge 3\lambda/\delta_0.)

The second clarification concerns the linear time implementation of the decoding algorithm (instead of the obvious O(n \log n) time implementation). The key insight is that in each iteration, the only vertices (on the relevant side for that iteration) that need to be locally decoded are those adjacent to some vertex on the other side that had some neighboring edges flipped in the local decoding of the previous iteration. The latter set shrinks geometrically in size by an argument as above. Let us be somewhat more specific. After the first (left) iteration, for each v \in L, the local subvector y_{|\Gamma(v)} of the current vector y \in \{0,1\}^{E} belongs to the code C_0. Let T(y) \subset R be the set of right-hand side vertices u for which y_{|\Gamma(u)} does not belong to C_0. Let z \in \{0,1\}^E be the vector after running the right-side decoding on y. Note that for each w \in L that is not a neighbor of any vertex in T(y), its neighborhood is untouched by the decoding. This means that in the next iteration (left-side decoding), all these vertices w need not be examined at all.

The algorithmic trick therefore is to keep track of the vertices whose local neighborhoods do not belong to C_0 in each round of decoding. In each iteration, we only perform local decoding at a subset of nodes D_i that was computed in the previous iteration. (This subset of left nodes gets initialized after the first two rounds of decoding as discussed above.) After performing local decoding at the nodes in D_i, we prepare the stage for the next round by computing D_{i+1} as the set of neighbors of all nodes in D_i whose local neighborhood did not belong to C_0 prior to decoding.
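
Schematically, the bookkeeping can be written as follows (an illustrative sketch: the callables in_local_code, decode_local and neighbors are assumed to be supplied by the caller, and are not primitives from the lecture):

    def iterative_decode(y, left_nodes, in_local_code, decode_local, neighbors, rounds):
        # in_local_code(y, v): does the local subvector y_{Gamma(v)} lie in C_0?
        # decode_local(y, v):  correct y_{Gamma(v)} in place to the nearest C_0
        #                      codeword; return True iff some edge was flipped.
        # neighbors(v):        the neighbors of v on the other side of the graph.
        dirty = set(left_nodes)  # nodes whose local word may still be corrupted
        for _ in range(rounds):
            touched = [v for v in dirty
                       if not in_local_code(y, v) and decode_local(y, v)]
            # Only neighbors of nodes whose edges just changed can need
            # decoding in the next round; everyone else is skipped.
            dirty = {u for v in touched for u in neighbors(v)}
            if not dirty:
                break
        return y

Since the sets of corrupted nodes shrink geometrically (by the argument above), the total work across all rounds is linear in n.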
