Conferences >2012 IEEE Conference on Compu...

Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation

Download PDF
Download References
Request Permissions
Save to
Alerts

Abstract:

In this paper we propose an approach to holistic scene understanding that reasons jointly about regions, location, class and spatial extent of objects, presence of a clas...Show More

Metadata

Abstract:

In this paper we propose an approach to holistic scene understanding that reasons jointly about regions, location, class and spatial extent of objects, presence of a class in the image, as well as the scene type. Learning and inference in our model are efficient as we reason at the segment level, and introduce auxiliary variables that allow us to decompose the inherent high-order potentials into pairwise potentials between a few variables with small number of states (at most the number of classes). Inference is done via a convergent message-passing algorithm, which, unlike graph-cuts inference, has no submodularity restrictions and does not require potential specific moves. We believe this is very important, as it allows us to encode our ideas and prior knowledge about the problem without the need to change the inference engine every time we introduce a new potential. Our approach outperforms the state-of-the-art on the MSRC-21 benchmark, while being much faster. Importantly, our holistic model is able to improve performance in all tasks.

Published in: 2012 IEEE Conference on Computer Vision and Pattern Recognition

Date of Conference: 16-21 June 2012

Date Added to IEEE Xplore: 26 July 2012

ISBN Information:

ISSN Information:

DOI: 10.1109/CVPR.2012.6247739

Conference Location: Providence, RI, USA

Citations are not available for this document.

Contents

1. Introduction

While there has been significant progress in solving tasks such as image labeling [14], object detection [5] and scene classification [26], existing approaches could benefit from solving these problems jointly [9]. For example, segmentation should be easier if we know where the object of interest is. Similarly, if we know the type of the scene, we can narrow down the classes we are expected to see, e.g., if we are looking at the sea, we are more likely to see a boat than a cow. Conversely, if we know which semantic regions (e.g., sky, road) and which objects are present in the scene, we can more accurately infer the scene type. Holistic scene understanding aims at recovering multiple related aspects of a scene so as to provide a deeper understanding of the scene as a whole.

Cites in Papers - |

Cites in Papers - Other Publishers (19)

Zehan Tan , Weidong Yang , Zhiwei Zhang , " PyraBiNet: A Hybrid Semantic Segmentation Network Combining PVT and\xa0BiSeNet for\xa0Deformable Objects in\xa0Indoor Environments ", Neural Information Processing , vol. 1968 , pp. 552 , 2024 .

MIT Libraries

MIT Libraries

Describing the scene as a whole: Joint object detection, scene classification and semantic segmentation

Alerts

Abstract:

Metadata

Abstract:

ISSN Information:

1. Introduction

Cites in Papers - IEEE (47) | Other Publishers (19)

Cites in Papers - IEEE (47)

Cites in Papers - Other Publishers (19)

References

IEEE Account

Purchase Details

Profile Information

Need Help?

Cites in Papers - |