Channel Capacity
Information theory [1] [2] provides a fundamental understanding of the limits of communication. Consider the following block diagram.
flowchart LR
Source -- "$$W_n$$" --> Encoder
Encoder -- "$$X^n$$" --> Channel
Channel -- "$$Y^n$$" --> Decoder
Decoder -- "$$\hat{W}_n$$" --> Sink
Here a message \(W_n\) consisting of \(nR\) bits is encoded into a codeword \(X^n\) spanning \(n\) channel uses; the rate \(R\) is therefore measured in bits per channel use. The channel is modeled by a probability law \(P_{Y^n|X^n}(y^n|x^n)\), so that the output \(Y^n\) is a potentially noisy version of the input \(X^n\). The decoder extracts the message estimate \(\hat{W}_n\) from the channel output \(Y^n\).
One of the fundamental results in information theory is that it is possible to communicate reliably over a noisy channel, in the sense that the average probability of error can be made arbitrarily small by allowing longer and longer blocks for encoding and decoding. Mathematically, we say that a communication rate \(R\) is achievable if there exists a sequence of encoders and decoders with \(\mathbb{P}[W_n \neq \hat{W}_n] \rightarrow 0\) as \(n \rightarrow \infty\). The channel capacity \(C\) is defined as the least upper bound, or supremum, of the achievable rates.
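To make the role of the block length concrete, here is a small Python sketch, assuming for illustration a binary symmetric channel with crossover probability \(p = 0.1\) and a simple length-\(n\) repetition code decoded by majority vote. The error probability vanishes as \(n\) grows, but so does the rate \(R = 1/n\); the remarkable content of the channel coding theorem is that vanishing error is possible at any fixed rate \(R < C\).

```python
# Illustrative sketch (assumed setup): a length-n repetition code over a binary
# symmetric channel BSC(p). The decoder takes a majority vote over the n received
# bits, so it errs exactly when more than half of them are flipped by the channel.
from math import comb

def repetition_error_prob(n: int, p: float) -> float:
    """P[majority-vote decoding error] for an n-fold repetition code over a BSC(p)."""
    # Use odd n so that a majority vote can never tie.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

for n in [1, 3, 11, 51, 101]:
    print(f"n = {n:3d}  rate = {1/n:.3f}  P[error] = {repetition_error_prob(n, 0.1):.2e}")
```

The error probability drops rapidly with \(n\), but only because the rate is being sacrificed; capacity tells us how far the rate can be pushed without giving up reliability.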
In addition, we can in principle determine the value of the channel capacity for a given communication channel. For a fairly wide class of channels, we have
\[C = \lim_{n\rightarrow\infty} \sup_{P_{X^n}(x^n)} \frac{1}{n}\mathbb{I}(X^n;Y^n),\]
where the mutual information is defined as
\[\mathbb{I}(X^n;Y^n) := \sum_{x^n,y^n} P_{X^n,Y^n}(x^n,y^n) \log \frac{P_{X^n,Y^n}(x^n,y^n)}{P_{X^n}(x^n)P_{Y^n}(y^n)}.\]
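For small alphabets the mutual information can be computed directly from this definition. Below is a minimal Python sketch, assuming as an example a binary symmetric channel with crossover probability \(0.1\) driven by a uniform input.

```python
# Minimal sketch (assumed example): evaluate I(X;Y) directly from a joint pmf.
import numpy as np

def mutual_information(p_xy: np.ndarray) -> float:
    """I(X;Y) in bits, computed from the joint pmf p_xy[x, y]."""
    p_x = p_xy.sum(axis=1, keepdims=True)    # marginal P_X
    p_y = p_xy.sum(axis=0, keepdims=True)    # marginal P_Y
    mask = p_xy > 0                          # use the convention 0 * log 0 = 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask])))

# Binary symmetric channel with crossover probability 0.1 and a uniform input.
p = 0.1
p_y_given_x = np.array([[1 - p, p], [p, 1 - p]])
p_xy = 0.5 * p_y_given_x                     # P_{X,Y}(x, y) = P_X(x) * P_{Y|X}(y|x)
print(mutual_information(p_xy))              # ~0.531 bits, i.e. 1 - H_b(0.1)
```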
Two special cases:
The memoryless case, in which the probability law factors as \[P_{Y^n|X^n}(y^n|x^n) = \prod_{k=1}^{n} P_{Y|X}(y_k|x_k),\] and the channel capacity simplifies to the single-letter expression \[C = \sup_{P_X(x)} \mathbb{I}(X;Y).\]
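As a rough numerical sketch of this single-letter formula (again assuming a binary symmetric channel with \(p = 0.1\)), the supremum can be approximated by a brute-force search over input distributions; for the BSC the maximum is attained at the uniform input and equals \(1 - H_b(p) \approx 0.531\) bits per channel use, where \(H_b\) is the binary entropy function.

```python
# Rough sketch (assumed example): approximate C = sup_{P_X} I(X;Y) for a BSC(0.1)
# by a grid search over binary input distributions parameterized by q = P_X(1).
import numpy as np

def mutual_information_bits(p_x: np.ndarray, p_y_given_x: np.ndarray) -> float:
    """I(X;Y) in bits for input pmf p_x and channel matrix p_y_given_x[x, y]."""
    p_xy = p_x[:, None] * p_y_given_x
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2((p_xy / (p_x[:, None] * p_y))[mask])))

p = 0.1
channel = np.array([[1 - p, p], [p, 1 - p]])           # BSC(p) as P_{Y|X}
grid = np.linspace(0.001, 0.999, 999)                  # candidate values of q = P_X(1)
rates = [mutual_information_bits(np.array([1 - q, q]), channel) for q in grid]
best = int(np.argmax(rates))
print(f"maximizing P_X(1) ~ {grid[best]:.3f}, C ~ {rates[best]:.4f} bits per use")
```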
The Gaussian memoryless case, in which \(Y = X + Z\) with additive noise \(Z\) that is zero-mean Gaussian with variance \(\sigma^2\) (equivalently, \(P_{Y|X}(y|x)\) is Gaussian with mean \(x\) and variance \(\sigma^2\)), the input satisfies the average power constraint \(\sum_{k=1}^{n}X_k^2 \le nP\), and the channel capacity becomes \[C = \frac{1}{2} \log_2\left(1 + \frac{P}{\sigma^2} \right).\]
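A quick consistency check, sketched in Python under the stated Gaussian model: with a Gaussian input of power \(P\), the mutual information equals \(h(Y) - h(Z)\), where \(h\) denotes differential entropy and a Gaussian with variance \(v\) has \(h = \tfrac{1}{2}\log_2(2\pi e v)\) bits; this difference reproduces the capacity formula above.

```python
# Sketch (assumed parameter values): the AWGN capacity formula agrees with
# h(Y) - h(Z) when the input X is Gaussian with power P, since then Y = X + Z
# is Gaussian with variance P + sigma2 and Z is Gaussian with variance sigma2.
import numpy as np

def gaussian_entropy_bits(variance: float) -> float:
    """Differential entropy (bits) of a zero-mean Gaussian with the given variance."""
    return 0.5 * np.log2(2 * np.pi * np.e * variance)

def awgn_capacity(P: float, sigma2: float) -> float:
    """Capacity (bits per channel use) of the AWGN channel with power constraint P."""
    return 0.5 * np.log2(1 + P / sigma2)

P, sigma2 = 1.0, 0.25                               # an assumed 6 dB signal-to-noise ratio
print(awgn_capacity(P, sigma2))                     # ~1.161 bits per channel use
print(gaussian_entropy_bits(P + sigma2) - gaussian_entropy_bits(sigma2))   # same value
```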