Huffman Coding
A Huffman code is a prefix code constructed via the following procedure:
- Binary: Create a prefix tree by merging the 2 symbols \(x,y\in \mathcal{X}\) with the lowest probability masses towards the root.
- \(D\) -ary: Merge \(D\) symbols with the smallest probability masses to form an internal node. If this is done in \(k+1\) steps, there will be \(k+1\) internal nodes and \(D+k(D-1)\) leaves where each leaf corresponds to a source symbol in the alphabet. If the alphabet size is of the form \(D+k(D-1)\) apply the procedure directly, otherwise add dummy symbols with 0 probability mass to the alphabet until the alphabet size is \(D+k(D-1)\).