Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Abstract:

This paper presents a method for recognizing scene categories based on approximate global geometric correspondence. This technique works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. The resulting "spatial pyramid" is a simple and computationally efficient extension of an orderless bag-of-features image representation, and it shows significantly improved performance on challenging scene categorization tasks. Specifically, our proposed method exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories. The spatial pyramid framework also offers insights into the success of several recently proposed image descriptions, including Torralba’s "gist" and Lowe’s SIFT descriptors.

Published in: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)

Date of Conference: 17-22 June 2006

Date Added to IEEE Xplore: 09 October 2006

Print ISBN: 0-7695-2597-0

Print ISSN: 1063-6919

DOI: 10.1109/CVPR.2006.68

Conference Location: New York, NY, USA

1. Introduction

In this paper, we consider the problem of recognizing the semantic category of an image. For example, we may want to classify a photograph as depicting a scene (forest, street, office, etc.) or as containing a certain object of interest. For such whole-image categorization tasks, bag-of-features methods, which represent an image as an orderless collection of local features, have recently demonstrated impressive levels of performance [7], [22], [23], [25]. However, because these methods disregard all information about the spatial layout of the features, they have severely limited descriptive ability. In particular, they are incapable of capturing shape or of segmenting an object from its background. Unfortunately, overcoming these limitations to build effective structural object descriptions has proven to be quite challenging, especially when the recognition system must be made to work in the presence of heavy clutter, occlusion, or large viewpoint changes. Approaches based on generative part models [3], [5] and geometric correspondence search [1], [11] achieve robustness at significant computational expense. A more efficient approach is to augment a basic bag-of-features representation with pairwise relations between neighboring local features, but existing implementations of this idea [11], [17] have yielded inconclusive results. One other strategy for increasing robustness to geometric deformations is to increase the level of invariance of local features (e.g., by using affine-invariant detectors), but a recent large-scale evaluation [25] suggests that this strategy usually does not pay off.
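
To make the contrast with an orderless bag of features concrete, here is a minimal sketch of the partition-and-histogram idea described in the abstract. It assumes local features have already been quantized into integer "visual word" indices and that their positions are normalized to [0, 1); the function name, its parameters, and the toy data are illustrative assumptions rather than the authors' implementation, and any per-level weighting or matching kernel is left out.

```python
import numpy as np

def spatial_pyramid_histogram(xy, words, num_words, num_levels):
    """Concatenate per-cell visual-word histograms over 1x1, 2x2, 4x4, ... grids.

    xy    : (N, 2) feature positions, assumed normalized to [0, 1)
    words : (N,) integer visual-word indices in [0, num_words)
    """
    xy = np.asarray(xy, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for level in range(num_levels):
        cells = 2 ** level  # grid is cells x cells at this level
        # Map each position to a grid cell; the clip guards against a coordinate of exactly 1.0.
        col = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        row = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        # Flatten (row, col, word) into a single bin index and count occurrences.
        bin_index = (row * cells + col) * num_words + words
        parts.append(np.bincount(bin_index, minlength=cells * cells * num_words))
    return np.concatenate(parts)

# Two toy "images" containing the same visual words in swapped locations.
words = np.array([0, 0, 1, 1])
xy_a = np.array([[0.1, 0.1], [0.2, 0.2], [0.8, 0.8], [0.9, 0.9]])
xy_b = xy_a[::-1]  # word 0 now sits where word 1 was, and vice versa

hist_a = spatial_pyramid_histogram(xy_a, words, num_words=2, num_levels=2)
hist_b = spatial_pyramid_histogram(xy_b, words, num_words=2, num_levels=2)

# Level 0 is the plain bag of features, so it cannot tell the two layouts apart.
same_at_level_0 = np.array_equal(hist_a[:2], hist_b[:2])
# Level 1 (a 2x2 grid) records which quadrant each word fell in, so it can.
differ_at_level_1 = not np.array_equal(hist_a[2:], hist_b[2:])
print(same_at_level_0, differ_at_level_1)
```

The first `num_words` entries of the result are the ordinary bag-of-features histogram; the remaining entries add coarse layout information, which is the sense in which the representation extends the orderless one.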
