Abstract:
A practical suboptimal (variable source coding) algorithm for lossy data compression is presented. This scheme is based on approximate string matching, and it naturally e...Show MoreMetadata
Abstract:
A practical suboptimal (variable source coding) algorithm for lossy data compression is presented. This scheme is based on approximate string matching, and it naturally extends the lossless Lempel-Ziv (1977) data compression scheme. Among others we consider the typical length of an approximately repeated pattern within the first n positions of a stationary mixing sequence where D percent of mismatches is allowed. We prove that there exists a constant r/sub 0/(D) such that the length of such an approximately repeated pattern converges in probability to 1/r/sub 0/(D) log n (pr.) but it almost surely oscillates between 1/r/sub -/spl infin//(D) log n and 2/r/sub 1/(D) log n, where r/sub -/spl infin//(D)>r/sub 0/(D)>r/sub 1/(D)/2 are some constants. These constants are natural generalizations of Renyi entropies to the lossy environment. More importantly, we show that the compression ratio of a lossy data compression scheme based on such an approximate pattern matching is asymptotically equal to r/sub 0/(D). We also establish the asymptotic behavior of the so-called approximate waiting time N/sub l/ which is defined as the time until a pattern of length C repeats approximately for the first time. We prove that log N/sub l//l/spl rarr/r/sub 0/(D) (pr.) as l/spl rarr//spl infin/. In general, r/sub 0/(D)>R(D) where R(D) is the rate distortion function. Thus for stationary mixing sequences we settle in the negative the problem investigated by Steinberg and Gutman by showing that a lossy extension of the Wyner-Ziv (1989) scheme cannot be optimal.
Published in: IEEE Transactions on Information Theory ( Volume: 43, Issue: 5, September 1997)
DOI: 10.1109/18.623143