Geometric context from a single image


Abstract:

Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric structure in the image. We show that we can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes. Geometric classes describe the 3D orientation of an image region with respect to the camera. We provide a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label. These confidences can then be used to improve the performance of many other applications. We provide a thorough quantitative evaluation of our algorithm on a set of outdoor images and demonstrate its usefulness in two applications: object detection and automatic single-view reconstruction.
Date of Conference: 17-21 October 2005
Date Added to IEEE Xplore: 05 December 2005
Print ISBN: 0-7695-2334-X

Conference Location: Beijing, China

1. Introduction

How can object recognition, while seemingly effortless for humans, remain so excruciatingly difficult for computers? The reason appears to be that recognition is inherently a global process. From sparse, noisy, local measurements our brain manages to create a coherent visual experience. When we see a person at the street corner, the simple act of recognition is made possible not just by the pixels inside the person-shape (there are rarely enough of them!), but also by many other cues: the surface on which he is standing, the 3D perspective of the street, the orientation of the viewer, etc. In effect, our entire visual panorama acts as a global recognition gestalt.
