Kraft Inequality

Let \(\mathcal{C}\) be a \(D\) -ary source code for a (source) random variable \(X\), i.e. a function that maps elements in its sample space \(\mathcal{X}\) to \(D^*\) which is the set of all finite length sequences from a \(D\) -ary code alphabet. Let \(l_1,\cdots,l_m\) be the lengths of the codewords. If \(\mathcal{C}\) is uniquely decodable, i.e. this mapping is injective, then the Kraft Inequality states that \[ \sum_{k=1}^m D^{-l_k}\leq 1 \]

This inequality important since it is equivalent to the existence of a \(D\) -ary prefix-free code with codeword lengths \(l_i\).

It is also connects distributions with lengths of (prefix) code trees - something Jorma Rissanen in his Festschrift mentioned was relevant for developing the Minimum Description Length (MDL) principle.