I. Introduction
Let $X$ be a discrete memoryless source defined on a finite or countably infinite alphabet ${\cal X}$. Without loss of generality, we assume that ${\cal X}=\{1,2,\ldots,m\}$, where $m$ can be infinite, and that the probabilities $p_i=\Pr\{X=i\}$, $i\in{\cal X}$, are ordered so that $p_1\geq p_2\geq\cdots$. The $n$th extension of the source $X$, denoted $X^n$, is characterized by the probability distribution $p(i_1,i_2,\ldots,i_n)=p_{i_1}p_{i_2}\cdots p_{i_n}$ for all $(i_1,i_2,\ldots,i_n)\in{\cal X}^n$. For a given code $C$ for $X^n$, the expected codeword length for $C$ is given by L(C)=\sum_{(i_1,i_2,\ldots,i_n)\in{\cal X}^n}p(i_1,i_2,\ldots,i_n) l(i_1,i_2,\ldots,i_n)
where $l(i_1,i_2,\ldots,i_n)$ is the length of the codeword assigned to $(i_1,i_2,\ldots,i_n)$. Let $L_n(X)$ be the expected codeword length per source symbol of an optimal prefix code for the extended source $X^n$, namely \eqalignno{L_n(X)&=\min\left\{{1 \over n}L(C): C\ {\hbox{is\ a\ prefix\ code\ for}}\ X^n \right\}. \qquad &\hbox{(1)} }
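As a concrete numerical illustration of the quantity $L(C)$, the expected length of a prefix code for the second extension of a two-symbol source can be computed directly from the definition; the probabilities and codewords below are hypothetical, chosen only for the example:

```python
from itertools import product

# Hypothetical DMS on alphabet {1, 2} with ordered probabilities p_1 >= p_2.
p = {1: 0.8, 2: 0.2}

# A hypothetical prefix code C for X^2 (one codeword per 2-symbol block).
code = {(1, 1): "0", (1, 2): "10", (2, 1): "110", (2, 2): "111"}

# L(C) = sum over blocks of p(i1, i2) * l(i1, i2), where p(i1, i2) = p_i1 * p_i2
# by memorylessness and l(i1, i2) is the codeword length for the block.
L = sum(p[i1] * p[i2] * len(code[(i1, i2)])
        for (i1, i2) in product(p, repeat=2))
print(L)  # expected length in bits per 2-symbol block
```

Dividing this value by $n=2$ gives the per-symbol rate ${1\over 2}L(C)$ that enters the minimization in (1).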
It is well known from Shannon's noiseless coding theorem [1], [2] that H(X) \leq L_n(X) < H(X)+{1 \over n} \eqno{\hbox{(2)}}
where $H(X)$ is the entropy of the source $X$.
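The bound (2) can be checked numerically: building an optimal (Huffman) prefix code for the $n$th extension of a small source and measuring its per-symbol rate shows $L_n(X)$ squeezed between $H(X)$ and $H(X)+1/n$. The three-symbol distribution below is a hypothetical example, not taken from the text:

```python
import heapq
import itertools
import math

def huffman_lengths(probs):
    """Codeword lengths of an optimal binary prefix (Huffman) code."""
    heap = [(q, [i]) for i, q in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        q1, ids1 = heapq.heappop(heap)   # merge the two least-probable subtrees
        q2, ids2 = heapq.heappop(heap)
        for i in ids1 + ids2:            # every leaf under the merge gains one bit
            lengths[i] += 1
        heapq.heappush(heap, (q1 + q2, ids1 + ids2))
    return lengths

p = [0.7, 0.2, 0.1]                      # hypothetical source distribution
H = -sum(q * math.log2(q) for q in p)    # entropy H(X) in bits

for n in (1, 2, 3):
    # Product distribution of the nth extension X^n (memoryless source).
    blocks = [math.prod(c) for c in itertools.product(p, repeat=n)]
    Ln = sum(q * l for q, l in zip(blocks, huffman_lengths(blocks))) / n
    print(f"n={n}: H(X)={H:.4f} <= L_n(X)={Ln:.4f} < H(X)+1/n={H + 1/n:.4f}")
```

Since the Huffman code is optimal among prefix codes, the computed rate equals $L_n(X)$ for each $n$, and the printed values illustrate how the upper bound in (2) tightens as $n$ grows.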