We now turn to *limitations* of codes, in the form of *upper bounds* on the rate of codes as a function of their relative distance. We will typically give concrete bounds on the size of codes, and then infer as corollaries the asymptotic statements for code families relating rate and relative distance. All the bounds apply to general codes and do not take advantage of linearity. However, for the most sophisticated of our bounds, the linear programming bound, which we discuss in the next set of notes, we will present the proof only for linear codes as it is simpler in this case.

We recall the two bounds we have already seen. The Gilbert-Varshamov bound asserted the existence of (linear) $q$-ary codes of block length $n$, distance at least $d$, and size at least $\frac{q^n}{\mathrm{Vol}_q(n,d-1)}$, where $\mathrm{Vol}_q(n,r) = \sum_{i=0}^{r} \binom{n}{i}(q-1)^i$ denotes the volume of a $q$-ary Hamming ball of radius $r$. Or in asymptotic form, the existence of codes of rate approaching $1 - h_q(\delta)$ and relative distance $\delta$. The Hamming or sphere-packing bound gave an upper bound on the size (or rate) of codes, which is our focus in these notes. The Hamming bound says that a $q$-ary code of block length $n$ and distance $d$ can have at most $\frac{q^n}{\mathrm{Vol}_q(n, \lfloor (d-1)/2 \rfloor)}$ codewords. Or in asymptotic form, a $q$-ary code of relative distance $\delta$ can have rate at most $1 - h_q(\delta/2) + o(1)$.
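As a quick sanity check of these formulas, here is a small Python snippet (an illustration of mine, not part of the notes) that computes the ball volume and evaluates both bounds for small parameters:

```python
from math import comb

def vol(q, n, r):
    """Volume of a q-ary Hamming ball of radius r in {0,...,q-1}^n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def gv_lower(q, n, d):
    """Gilbert-Varshamov: a q-ary code of distance d with at least this many codewords exists."""
    return q ** n // vol(q, n, d - 1)

def hamming_upper(q, n, d):
    """Hamming (sphere-packing): no q-ary code of distance d can be larger than this."""
    return q ** n // vol(q, n, (d - 1) // 2)

# Example: binary, n = 7, d = 3. The Hamming ball of radius 1 has 8 points,
# so the upper bound is 2^7 / 8 = 16, attained by the [7,4,3] Hamming code.
print(gv_lower(2, 7, 3), hamming_upper(2, 7, 3))  # -> 4 16
```

The gap between the two values (4 vs. 16 already at these tiny parameters) is the subject of these notes.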

As remarked in our discussion on Shannon’s theorem for the binary symmetric channel, if the Hamming bound could be attained, that would imply that we can correct *all* (as opposed to most) error patterns of weight $pn$ with rate approaching $1 - h(p)$. Recall that there are perfect codes (such as the Hamming codes) that meet the Hamming bound. However, these codes have very small distance ($d = 3$ in the case of Hamming codes). A generalization of Hamming codes called binary BCH codes (the acronym stands for the codes' independent inventors Hocquenghem (1959) and Bose and Ray-Chaudhuri (1960)) shows that when $d$ is a fixed constant and the block length is allowed to grow, the Hamming bound is again tight up to lower order terms. However, we will improve upon the Hamming bound and show that its asymptotic form (for any relative distance bounded away from zero) cannot be attained for any fixed alphabet. The proof method has some connections to *list decoding*, which will be an important focus topic later in the course.

**1. Singleton bound **

We begin with the simplest of the bounds:

Theorem 1 Let $C$ be a code of block length $n$ and minimum distance $d$ over an alphabet of size $q$. Then $|C| \le q^{n-d+1}$.

*Proof:* Suppose not, and $|C| > q^{n-d+1}$. By the pigeonhole principle there must be two codewords $c_1 \ne c_2$ that agree on the first $n-d+1$ locations. But then $\Delta(c_1, c_2) \le d - 1 < d$, contradicting the hypothesis that $C$ has minimum distance $d$.

This gives an alphabet-independent asymptotic upper bound on the rate as a function of relative distance.

Corollary 2 The rate $R$ and relative distance $\delta$ of a code satisfy $R \le 1 - \delta + o(1)$.

Though really simple, the Singleton bound is tight in general: we will later see an algebraic family of codes called Reed-Solomon codes which achieve the Singleton bound, having dimension $n - d + 1$ and minimum distance $d$. Families of codes which meet the Singleton bound are called *maximum distance separable* (MDS) codes.
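For a concrete, if trivial, binary example of an MDS code (my illustration; the interesting MDS codes in these notes are the Reed-Solomon codes), the snippet below verifies that the even-weight (parity-check) code of length $5$ has distance $2$ and meets the Singleton bound $|C| = 2^{n-d+1}$ with equality:

```python
from itertools import product

# The binary even-weight code of length n: all words with an even number of 1s.
# It has dimension n-1 and minimum distance 2, so |C| = 2^{n-1} = q^{n-d+1},
# meeting the Singleton bound with equality (a toy MDS code).
n = 5
code = [c for c in product((0, 1), repeat=n) if sum(c) % 2 == 0]

def dist(x, y):
    """Hamming distance between two words."""
    return sum(a != b for a, b in zip(x, y))

d = min(dist(x, y) for x in code for y in code if x != y)
print(len(code), d, 2 ** (n - d + 1))  # -> 16 2 16
```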

However, Reed-Solomon and other MDS codes will be (necessarily) defined over an alphabet that grows with the block length. For code families over a fixed alphabet such as binary codes, substantial improvements to the Singleton bound are possible. We turn to such bounds next.

**2. The Plotkin bound **

The Gilbert-Varshamov bound asserts the existence of positive rate binary codes only for relative distance $\delta < 1/2$. The Hamming bound on the other hand does not rule out positive rate binary codes even for $\delta = 1/2$, in fact not even for any $\delta < 1$. Thus there is a qualitative gap between these bounds in terms of identifying the largest possible distance for asymptotically good binary codes. We now prove an upper bound which shows that the relative distance has to be at most $1/2$ (and thus the Hamming bound is quite weak for large $\delta$) unless the code has very few codewords (and in particular has vanishing rate).

While the same underlying ideas are involved, the proofs are simpler to present for the binary case, so we will focus on binary codes. We will state the bound for the -ary case and leave the details as an exercise. Our proofs reduce the task of bounding the size of the code to bounding the number of pairwise far-apart unit vectors in Euclidean space, and then use a geometric argument for the latter task.

We first state the geometric lemma we need.

Lemma 3 Let $v_1, v_2, \ldots, v_m$ be unit vectors in $\mathbb{R}^n$.

1. Suppose $\langle v_i, v_j \rangle \le -\epsilon < 0$ for all $i \ne j$. Then $m \le 1 + \frac{1}{\epsilon}$.
2. Suppose $\langle v_i, v_j \rangle \le 0$ for all $i \ne j$. Then $m \le 2n$.

*Proof:* We only prove the first part, and leave the second as an (interesting) exercise. Note that the bound of $2n$ in the second part is best possible, as we can take the $n$ standard basis vectors and their negations. For the first part, we have

$$0 \le \left\| \sum_{i=1}^{m} v_i \right\|^2 = \sum_{i=1}^{m} \|v_i\|^2 + \sum_{i \ne j} \langle v_i, v_j \rangle \le m - m(m-1)\epsilon,$$

which gives $m \le 1 + \frac{1}{\epsilon}$.
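Both parts of the lemma are tight, and the extremal configurations are easy to check numerically. The sketch below (mine, not from the notes) verifies the basis-vectors-and-negations example for the second part, and a regular simplex with pairwise dot products exactly $-\frac{1}{m-1}$ for the first:

```python
import itertools
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Part 2 tightness: the n standard basis vectors and their negations give
# 2n unit vectors in R^n with pairwise dot products <= 0.
n = 4
vecs = []
for i in range(n):
    e = [0.0] * n
    e[i] = 1.0
    vecs.append(tuple(e))
    vecs.append(tuple(-x for x in e))
assert all(dot(u, v) <= 1e-12 for u, v in itertools.combinations(vecs, 2))

# Part 1 tightness: a regular simplex of m unit vectors whose pairwise dot
# products are exactly -1/(m-1); with eps = 1/(m-1) this meets m = 1 + 1/eps.
m = 5
center = [1.0 / m] * m
simplex = []
for i in range(m):
    e = [0.0] * m
    e[i] = 1.0
    w = [a - b for a, b in zip(e, center)]  # basis vector minus centroid
    norm = math.sqrt(dot(w, w))
    simplex.append([x / norm for x in w])
assert all(abs(dot(u, v) + 1.0 / (m - 1)) < 1e-9
           for u, v in itertools.combinations(simplex, 2))
print(len(vecs), m)  # -> 8 5
```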

Using the above, we can prove that a binary code of block length $n$ and distance $d \ge n/2$ cannot have too many codewords.

Theorem 4 Let $C$ be a binary code of block length $n$ and distance $d$.

1. If $d > n/2$, then $|C| \le \frac{2d}{2d - n}$.
2. If $d \ge n/2$, then $|C| \le 2n$.

*Proof:* Let $M = |C|$ and let $c_1, c_2, \ldots, c_M$ be the codewords of $C$. By hypothesis $\Delta(c_i, c_j) \ge d$ for all $i \ne j$. We will map the codewords into unit vectors $v_i \in \mathbb{R}^n$, $1 \le i \le M$, such that the angle between every pair of vectors is at least $90$ degrees (i.e., their dot product $\langle v_i, v_j \rangle \le 0$). These vectors are defined as follows:

$$v_i = \frac{1}{\sqrt{n}} \left( (-1)^{c_i(1)}, (-1)^{c_i(2)}, \ldots, (-1)^{c_i(n)} \right),$$

where $c_i(\ell)$ is the $\ell$'th bit of the codeword $c_i$. It is easy to check that

$$\langle v_i, v_j \rangle = \frac{n - 2\Delta(c_i, c_j)}{n} \le \frac{n - 2d}{n}.$$

When $d \ge n/2$, these dot products are non-positive, and by the second part of Lemma 3, we can bound $|C| \le 2n$.

For the first part, if $d > n/2$, then $\langle v_i, v_j \rangle \le \frac{n-2d}{n} = -\frac{2d-n}{n} < 0$, and therefore by the first part of Lemma 3, we can bound

$$|C| \le 1 + \frac{n}{2d - n} = \frac{2d}{2d - n}.$$
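The bound $|C| \le 2n$ for $d = n/2$ is attained by the Hadamard (first-order Reed-Muller) code, whose codewords are the evaluations of all affine maps over $\mathbb{F}_2^k$. The snippet below (my illustration, not part of the notes) verifies this for $k = 3$, i.e., $n = 8$:

```python
from itertools import product

# Hadamard / first-order Reed-Muller code: codewords are the evaluation
# vectors of all affine maps x -> <a, x> + b over F_2^k, on all 2^k points.
# It has length n = 2^k, minimum distance exactly n/2, and 2n codewords,
# so Theorem 4's bound |C| <= 2n for d = n/2 is tight.
k = 3
points = list(product((0, 1), repeat=k))
n = len(points)  # 8
code = set()
for a in product((0, 1), repeat=k):
    for b in (0, 1):
        code.add(tuple((sum(ai * xi for ai, xi in zip(a, x)) + b) % 2
                       for x in points))

def dist(x, y):
    return sum(u != v for u, v in zip(x, y))

d = min(dist(x, y) for x in code for y in code if x != y)
print(len(code), n, d)  # -> 16 8 4 : 2n codewords at distance n/2
```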

The above shows that a code family of relative distance $\delta > 1/2$ can have at most $\frac{2\delta}{2\delta - 1}$ codewords, a constant. Thus a code family cannot have relative distance strictly bounded away from $1/2$ with a number of codewords that is growing with the block length. In particular, such code families have zero rate. We now prove that this is also the case if the relative distance is $1/2$.

We now use a “puncturing” argument, which implies that the complement of the feasible rate vs relative distance region is convex, to derive an upper bound of $1 - 2\delta$ on the rate for relative distance $\delta \le 1/2$.

Theorem 5 If a binary code $C$ has block length $n$ and distance $d \le n/2$, then $|C| \le d \cdot 2^{n - 2d + 2}$.

*Proof:* Let $\ell = n - 2d + 1$ and $n' = n - \ell = 2d - 1$. For each $a \in \{0,1\}^{\ell}$, define the subcode $C_a$ to be the subcode of $C$ consisting of all codewords which have $a$ in the first $\ell$ positions, projected on the last $n'$ coordinates. Formally, $C_a = \{ (c_{\ell+1}, \ldots, c_n) : c \in C, \ (c_1, \ldots, c_{\ell}) = a \}$. Each $C_a$ is a binary code of block length $n' = 2d - 1$. Note that since $C$ has distance at least $d$, so does $C_a$ (two codewords of $C_a$ arise from codewords of $C$ agreeing on the first $\ell$ positions). Since $d > n'/2$, by Theorem 4, $|C_a| \le \frac{2d}{2d - n'} = 2d$. Of course, $|C| = \sum_{a \in \{0,1\}^{\ell}} |C_a|$. So we conclude $|C| \le 2^{\ell} \cdot 2d = d \cdot 2^{n - 2d + 2}$.

We record the asymptotic implication of the above upper bound as:

Corollary 6 The rate $R$ of a binary code of relative distance $\delta$ must satisfy $R \le 1 - 2\delta + o(1)$.

The above arguments can be extended to the $q$-ary case. The idea is to map the $q$ symbols to $q$ unit vectors whose pairwise dot product is exactly $-\frac{1}{q-1}$.

Exercise 1 Let $C$ be a $q$-ary code of block length $n$ and minimum distance at least $d$.

1. If $d > \left(1 - \frac{1}{q}\right) n$, then $|C| \le \frac{qd}{qd - (q-1)n}$.
2. When $d = \left(1 - \frac{1}{q}\right) n$, $|C| \le 2qn$.

Deduce that the rate $R$ of a $q$-ary code of relative distance $\delta$ must satisfy $R \le 1 - \frac{q}{q-1}\delta + o(1)$.

Here is a plot of the Gilbert-Varshamov lower bound on rate, and the Hamming and Plotkin upper bounds on rate, for binary codes. On the horizontal axis is the relative distance $\delta$ and on the vertical axis is the rate $R$. Any point under the leftmost curve (the Gilbert-Varshamov bound) is achievable, and any point above either the Plotkin bound (the straight line) or the Hamming bound is not achievable. Notice that the Hamming bound is stronger than the Plotkin bound for low distances (or high rates). We now proceed to a bound that improves upon both the Hamming and Plotkin bounds.
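Since the plot itself is not reproduced here, the following snippet (my illustration) evaluates the three curves at a few values of $\delta$, exhibiting the crossover between the Hamming and Plotkin bounds:

```python
from math import log2

def h(x):
    """Binary entropy function h(x) = -x log x - (1-x) log (1-x)."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def gv_rate(delta):       # achievable rate (Gilbert-Varshamov): 1 - h(delta)
    return 1 - h(delta) if delta < 0.5 else 0.0

def hamming_rate(delta):  # upper bound (sphere packing): 1 - h(delta/2)
    return 1 - h(delta / 2)

def plotkin_rate(delta):  # upper bound (Plotkin): 1 - 2*delta
    return max(0.0, 1 - 2 * delta)

for delta in (0.05, 0.1, 0.3, 0.45):
    print(delta, round(gv_rate(delta), 3), round(hamming_rate(delta), 3),
          round(plotkin_rate(delta), 3))
```

For small $\delta$ the Hamming curve lies below the Plotkin line, while near $\delta = 1/2$ the Plotkin line is far stronger (it hits zero rate, which the Hamming bound never does for $\delta < 1$).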

**3. Johnson and Elias-Bassalygo bounds **

Recall that $A_q(n,d)$ denotes the size of the largest $q$-ary code of block length $n$ and distance $d$. We denote by $A_q(n,d,w)$ the size of a largest *constant weight* code of block length $n$ and distance $d$ all of whose codewords have Hamming weight $w$. We also denote by $A'_q(n,d,w)$ the largest size of a code of block length $n$ and distance $d$ all of whose codewords have Hamming weight *at most* $w$. For the case $q = 2$, we just denote these quantities by $A(n,d)$, $A(n,d,w)$, and $A'(n,d,w)$ respectively.

On your problem set, you are asked to prove the following (just for the binary case).

Exercise 2 For every $w \in \{0, 1, \ldots, n\}$,

$$A(n,d) \le \frac{2^n}{\binom{n}{w}} \cdot A(n,d,w).$$

The above gives a method to upper bound the size of unrestricted codes via upper bounds on the size of constant weight codes. Note that when $w = \lfloor \frac{d-1}{2} \rfloor$, $A(n,d,w) = 1$, so the Hamming-like upper bound $A(n,d) \le \frac{2^n}{\binom{n}{\lfloor (d-1)/2 \rfloor}}$ is a special case of this. In general, the larger the $w$ as a function of $n$ and $d$ for which we can prove a good upper bound on $A(n,d,w)$ (as either a constant or a polynomial function of $n$), the better the upper bound on $A(n,d)$.

We will now prove such an upper bound, in fact for the more general quantity $A'_q(n,d,e)$. Such a bound, called the *Johnson bound*, is intimately connected to *list decoding*. Proving that $A'_q(n,d,e)$ is small, say at most some $L$ which is either a constant or bounded by a polynomial in $n$, implies that for *every* $q$-ary code $C$ of block length $n$ and distance $d$, every Hamming ball of radius $e$ contains at most $L$ codewords of $C$. In other words, if a codeword is transmitted and at most $e$ errors corrupt it, then one can *list decode* a small list of at most $L$ candidate codewords one of which must equal the original codeword. Of course, for $e \le \lfloor \frac{d-1}{2} \rfloor$, we have $L = 1$, and the key here is that one can have $e \gg d/2$ and still ensure a small worst-case *list size* $L$.

Once we prove the Johnson bound, we will deduce our desired bound on $A(n,d)$, called the Elias-Bassalygo bound after its inventors, by appealing to Exercise 2. Our proofs of the Johnson bound will be geometric in nature, relying on Lemma 3. We prove the bounds for binary codes, and leave the extension to larger alphabets as exercises.

** 3.1. Binary codes **

Theorem 7 (Binary Johnson bound) For integers $1 \le e \le n$ and $d \le n/2$, if $e \le \frac{n}{2}\left(1 - \sqrt{1 - \frac{2d}{n}}\right)$, then $A'(n,d,e) \le 2n$. (We will often refer to the quantity $\frac{n}{2}\left(1 - \sqrt{1 - \frac{2d}{n}}\right)$ as $J(n,d)$, the (binary) Johnson radius.)

*Proof:* Let $C' = \{c_1, c_2, \ldots, c_M\}$ be a code such that $\Delta(c_i, c_j) \ge d$ for $i \ne j$, and $\mathrm{wt}(c_i) \le e$ for each $i$. We will map the codewords into vectors $v_i \in \mathbb{R}^n$ similarly to the proof of the Plotkin bound (Theorem 4), except we won’t normalize them to unit vectors here:

$$v_i = \left( (-1)^{c_i(1)}, (-1)^{c_i(2)}, \ldots, (-1)^{c_i(n)} \right),$$

where $c_i(\ell)$ is the $\ell$'th bit of the codeword $c_i$. Likewise the all $0$s vector is mapped to the vector

$$u = (1, 1, \ldots, 1).$$

Let $\alpha \in (0,1)$ be a parameter to be picked later. The parameter will be picked so that all the pairwise dot products $\langle v_i - \alpha u, v_j - \alpha u \rangle$ are nonpositive for $i \ne j$. Now

$$\langle v_i - \alpha u, v_j - \alpha u \rangle = \langle v_i, v_j \rangle - \alpha \langle u, v_i + v_j \rangle + \alpha^2 n \le (n - 2d) - 2\alpha(n - 2e) + \alpha^2 n,$$

using $\langle v_i, v_j \rangle = n - 2\Delta(c_i, c_j) \le n - 2d$ and $\langle u, v_i \rangle = n - 2\,\mathrm{wt}(c_i) \ge n - 2e$.

The latter quantity is at most $0$ provided

$$e \le \frac{n}{2} - \frac{1}{4}\left( \frac{n - 2d}{\alpha} + \alpha n \right).$$

The choice $\alpha = \sqrt{\frac{n-2d}{n}}$ maximizes the right hand side, and leads to the requirement

$$e \le \frac{n}{2} - \frac{1}{2}\sqrt{n(n-2d)} = \frac{n}{2}\left(1 - \sqrt{1 - \frac{2d}{n}}\right),$$

which is met by the hypothesis about $e$. Therefore, for this choice of $\alpha$, the dot products $\langle v_i - \alpha u, v_j - \alpha u \rangle$ are nonpositive for $i \ne j$. Appealing to (the second part of) Lemma 3, after scaling the vectors $v_i - \alpha u$ to unit vectors (which does not affect the signs of the dot products), we can conclude $M \le 2n$, and thus $A'(n,d,e) \le 2n$.

Remark 1 It is possible to improve the upper bound on $A'(n,d,e)$ for $e < J(n,d)$ from $2n$ to $n$ by noting that for the above choice of parameters $\langle v_i - \alpha u, u \rangle > 0$ for each $i$. This together with the nonpositive pairwise dot products can be used to improve the geometric upper bound on the number of vectors from $2n$ to $n$.
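As a concrete check of Theorem 7 (my illustration, using the $[7,4,3]$ Hamming code), one can brute-force the worst-case number of codewords in any ball of radius $\lfloor J(n,d) \rfloor$ and confirm it is well below $2n$:

```python
from itertools import product
from math import sqrt, floor

# Brute-force check of the Johnson bound on the [7,4,3] binary Hamming code:
# every ball of radius e <= J(n,d) must contain at most 2n codewords.
n, d = 7, 3
J = n / 2 * (1 - sqrt(1 - 2 * d / n))  # binary Johnson radius (~2.18 here)
e = floor(J)

# Hamming code: all length-7 words with zero syndrome w.r.t. the standard
# parity-check matrix whose columns are the binary representations of 1..7.
H = [(1, 0, 1, 0, 1, 0, 1), (0, 1, 1, 0, 0, 1, 1), (0, 0, 0, 1, 1, 1, 1)]
code = [c for c in product((0, 1), repeat=n)
        if all(sum(hi * ci for hi, ci in zip(h, c)) % 2 == 0 for h in H)]

def dist(x, y):
    return sum(a != b for a, b in zip(x, y))

# Worst-case list size: max over all 2^7 centers of the number of codewords
# within distance e of that center.
worst = max(sum(dist(center, c) <= e for c in code)
            for center in product((0, 1), repeat=n))
print(len(code), e, worst)
```

Note that $e = 2$ exceeds the unique-decoding radius $\lfloor (d-1)/2 \rfloor = 1$ of this code, yet the worst-case list size stays small, which is exactly the list-decoding phenomenon discussed above.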

For $e$ slightly less than the Johnson radius $J(n,d)$, one can sharpen the upper bound on $A'(n,d,e)$ to a constant independent of $n$.

Exercise 3 Prove that when $e \le \frac{n}{2}\left(1 - \sqrt{1 - \frac{2d}{n} + \epsilon}\right)$ for some $\epsilon > 0$, $A'(n,d,e) \le 1 + \frac{2}{\epsilon}$.

(Hint: Follow the above proof, and pick parameters so that the first part of Lemma 3 can be used.)

Together with Exercise 2, we thus have the following upper bound, called the Elias-Bassalygo bound, on the size (rate) of codes of certain distance (relative distance).

Theorem 8 For integers $1 \le d \le n/2$,

$$A(n,d) \le 2n \cdot \frac{2^n}{\binom{n}{\lfloor J(n,d) \rfloor}},$$

where $J(n,d) = \frac{n}{2}\left(1 - \sqrt{1 - \frac{2d}{n}}\right)$ is the Johnson radius.

Thus a binary code family of relative distance $\delta$ has rate at most

$$1 - h\left( \frac{1 - \sqrt{1 - 2\delta}}{2} \right) + o(1).$$

** 3.2. Statement for larger alphabets **

As with the Plotkin bound, we leave the extension of the Johnson and Elias-Bassalygo bounds to larger alphabets as exercises. The hint is to map the $q$ symbols into appropriate vectors in $\mathbb{R}^q$ so that large distance between codewords translates into small dot product between their associated vectors.

Exercise 4 For all integers $q \ge 2$ and $1 \le d \le \left(1 - \frac{1}{q}\right) n$, $A'_q(n,d,e) \le 2qn$ provided

$$e \le J_q(n,d) := n \left(1 - \frac{1}{q}\right)\left(1 - \sqrt{1 - \frac{q}{q-1} \cdot \frac{d}{n}}\right).$$

Further, if

$$e \le n \left(1 - \frac{1}{q}\right)\left(1 - \sqrt{1 - \frac{q}{q-1} \cdot \frac{d}{n} + \epsilon}\right)$$

for some $\epsilon > 0$, then $A'_q(n,d,e) \le O(1/\epsilon)$.

Together with Exercise 2, this gives the Elias-Bassalygo upper bound for $q$-ary codes:

Theorem 9 A $q$-ary code family of relative distance $\delta$ has rate at most

$$1 - h_q\left( \left(1 - \frac{1}{q}\right)\left(1 - \sqrt{1 - \frac{q}{q-1}\,\delta}\right) \right) + o(1).$$

** 3.3. Alphabet oblivious bound for Johnson radius **

When discussing list decoding of codes such as Reed-Solomon codes which are defined over an alphabet size that grows with the block length, it will be useful to have the following “alphabet oblivious” version of the Johnson bound. (This version is also a simpler and often good enough approximation to the $q$-ary Johnson radius when $q$ is somewhat large.)

Theorem 10 Let $C$ be a $q$-ary code of block length $n$ and distance $d$. Then the following hold:

1. Every Hamming ball of radius at most $n - \sqrt{n(n-d)}$ in $[q]^n$ has at most $2qn$ codewords of $C$.
2. Every Hamming ball of radius at most $n\left(1 - \sqrt{1 - \frac{d}{n} + \epsilon}\right)$ in $[q]^n$, for $\epsilon > 0$, has at most $O(1/\epsilon)$ codewords of $C$.

The above statement follows from Exercise 4 by verifying that for every $q \ge 2$ and $0 \le x \le 1 - \frac{1}{q}$,

$$\left(1 - \frac{1}{q}\right)\left(1 - \sqrt{1 - \frac{q}{q-1}\, x}\right) \ge 1 - \sqrt{1 - x}.$$
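This inequality (that the $q$-ary Johnson radius dominates the alphabet-oblivious one) is easy to confirm numerically; the snippet below (mine, not from the notes) checks it on a grid of values of $q$ and $x$:

```python
from math import sqrt

# Check: for q >= 2 and 0 <= x <= 1 - 1/q,
#   (1 - 1/q) * (1 - sqrt(1 - q*x/(q-1)))  >=  1 - sqrt(1 - x),
# i.e. the q-ary Johnson radius is at least the alphabet-oblivious radius.
def johnson_q(q, x):
    return (1 - 1 / q) * (1 - sqrt(1 - q * x / (q - 1)))

def johnson_free(x):
    return 1 - sqrt(1 - x)

ok = True
for q in (2, 3, 4, 16, 256):
    for i in range(1000):
        x = (1 - 1 / q) * i / 1000  # sweep 0 <= x < 1 - 1/q
        ok = ok and johnson_q(q, x) >= johnson_free(x) - 1e-12
print(ok)  # -> True
```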

We conclude with a plot that adds the Elias-Bassalygo upper bound to the earlier plot. Note that this bound improves on both the Hamming and Plotkin bounds, but for small distances the difference between the Hamming and Elias-Bassalygo bounds is small.
