Abstract:
Today, biomedical media data are being generated at rates unimaginable only a few years ago. Content-based retrieval of biomedical media from large databases is becoming increasingly important to clinical, research, and educational communities. In this paper, we present the recently developed entropy balanced statistical (EBS) k-d tree and its applications to biomedical media, including a high-resolution computed tomography (HRCT) lung image database and the first real-time protein tertiary structure search engine. Our index utilizes statistical properties inherent in large-scale biomedical media databases for efficient and accurate searches. By applying concepts from pattern recognition and information theory, the EBS k-d tree is built through top-down decision tree induction. Experiments show that similarity searches against a protein structure database of 53 363 structures consistently execute in less than 8.14 ms for the top 100 most similar structures. Additionally, we have shown improved retrieval precision over adaptive and statistical k-d trees. Retrieval precision of the EBS k-d tree is 81.6% for content-based retrieval of HRCT lung images and 94.9% at 10% recall for protein structure similarity search. The EBS k-d tree has enormous potential for use in biomedical applications embedded with ground-truth knowledge and multidimensional signatures.
Published in: IEEE Transactions on Information Technology in Biomedicine (Volume: 11, Issue: 3, May 2007)
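The abstract describes the index as built by top-down decision tree induction using ideas from information theory, but it does not give the construction itself. The following is only a minimal sketch of that general idea, assuming labeled feature vectors and a plain weighted-entropy split criterion; it is not the EBS k-d tree algorithm from the paper, and every name in it (`entropy`, `best_split`, `build`, `find_leaf`) is hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(points, labels):
    """Return the (dimension, threshold) that most reduces weighted child
    entropy, or None if no axis-aligned split improves on the parent.
    Assumption: this generic criterion stands in for the paper's own."""
    n = len(points)
    best, best_score = None, entropy(labels)
    for dim in range(len(points[0])):
        values = sorted(set(p[dim] for p in points))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [l for p, l in zip(points, labels) if p[dim] <= thr]
            right = [l for p, l in zip(points, labels) if p[dim] > thr]
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (dim, thr), score
    return best


def build(points, labels):
    """Top-down induction: split recursively until no split lowers entropy.
    Expects a non-empty list of equal-length feature vectors."""
    split = best_split(points, labels)
    if split is None:
        return {"leaf": True, "points": points, "labels": labels}
    dim, thr = split
    left = [(p, l) for p, l in zip(points, labels) if p[dim] <= thr]
    right = [(p, l) for p, l in zip(points, labels) if p[dim] > thr]
    return {
        "leaf": False,
        "dim": dim,
        "thr": thr,
        "left": build([p for p, _ in left], [l for _, l in left]),
        "right": build([p for p, _ in right], [l for _, l in right]),
    }


def find_leaf(node, query):
    """Descend to the leaf bucket whose region contains the query vector."""
    while not node["leaf"]:
        node = node["left"] if query[node["dim"]] <= node["thr"] else node["right"]
    return node


if __name__ == "__main__":
    # Toy two-dimensional signatures with two ground-truth classes.
    pts = [(0.1, 0.9), (0.2, 0.8), (0.8, 0.2), (0.9, 0.1)]
    lbs = ["a", "a", "b", "b"]
    tree = build(pts, lbs)
    print(find_leaf(tree, (0.15, 0.7))["labels"])
```

In this sketch a query descends to a single leaf bucket and the candidates stored there would then be ranked by distance. How the published index balances the tree and handles overlapping class distributions is not stated in the abstract and is not modeled here.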