Introduction
Let be an image, and let independent identically distributed (i.i.d.) -dimensional feature vectors be extracted from this image. Examples of such a feature vector are the position and orientation of a randomly chosen edge, a vector of samples in a textured region, or the output vector of a spatial prediction filter. Such features can be used for registering two images to each other, texture classification and segmentation, or con-tent-based image retrieval. The basic objective of these applications can be reduced to assessing characteristics of the distribution of the feature vectors. For example, the mutual information method of image registration [53] searches through a number of coordinate transformations to find the one that minimizes the entropy of the joint feature distribution of the two images. Similarly, many image retrieval algorithms search through a database of images to find the homologous image whose feature distribution is closest to that of the query image where closeness is measured in terms of minimum information divergence [50], [47], [11]. This article discusses minimal graph methods for estimating entropy and divergence measures associated with a set of feature vectors. Specifically, we focus on a class of graphs which span the set of feature vectors and as a byproduct produces a consistent estimator of feature entropy and divergence. Wecall such graphs entropic spanning graphs.