1. Background and Motivation
Massive image databases are becoming increasingly common. Examples include document image databases such as declassified government documents [8] and photo archives such as the New York Times archive. Duplicate removal offers space and bandwidth savings, and more user-friendly search results. Despite some effort to cull duplicates, the image search service of Google [4] often retrieves a number of duplicate and near-duplicate images. Duplicate detection also finds application in copyright protection and authentication.