Scene Parsing with Object Instances and Occlusion Ordering

Abstract:

This work proposes a method to interpret a scene by assigning a semantic label at every pixel and inferring the spatial extent of individual object instances together with their occlusion relationships. Starting with an initial pixel labeling and a set of candidate object masks for a given test image, we select a subset of objects that explain the image well and have valid overlap relationships and occlusion ordering. This is done by minimizing an integer quadratic program using either a greedy method or a standard solver. Then we alternate between using the object predictions to refine the pixel labels and vice versa. The proposed system obtains promising results on two challenging subsets of the LabelMe and SUN datasets, the largest of which contains 45,676 images and 232 classes.
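
As a rough illustration of the greedy option mentioned in the abstract, the sketch below minimizes a generic binary quadratic energy over candidate objects. It is not the paper's formulation: the energy form, the names `unary`, `pairwise`, `energy`, and `greedy_select`, and the toy numbers are all assumptions made for this example. Here a negative unary cost stands for a candidate that explains the image well, and a positive pairwise cost stands for an invalid overlap between two candidates.

```python
def energy(x, unary, pairwise):
    """E(x) = sum_i unary[i]*x[i] + sum_{i<j} pairwise[i][j]*x[i]*x[j], with x[i] in {0, 1}."""
    n = len(x)
    e = sum(unary[i] for i in range(n) if x[i])
    e += sum(pairwise[i][j]
             for i in range(n) for j in range(i + 1, n)
             if x[i] and x[j])
    return e


def greedy_select(unary, pairwise):
    """Add one candidate at a time, always the one that lowers the energy most.

    Stops when no remaining candidate lowers the energy. This is a generic
    greedy heuristic (an assumption of this sketch) and is not guaranteed to
    find the global minimum of the integer quadratic program.
    """
    n = len(unary)
    x = [0] * n
    current = 0.0  # energy of the empty selection
    while True:
        best_i, best_e = None, current
        for i in range(n):
            if x[i]:
                continue
            x[i] = 1                      # tentatively select candidate i
            e = energy(x, unary, pairwise)
            x[i] = 0                      # undo the tentative selection
            if e < best_e:
                best_i, best_e = i, e
        if best_i is None:
            return x, current
        x[best_i] = 1
        current = best_e


# Toy instance (hypothetical numbers): candidates 0 and 1 conflict strongly,
# candidate 2 is compatible with candidate 0.
unary = [-3.0, -2.0, -1.0]
pairwise = [[0, 4.0, 0.0],
            [0, 0, 0.5],
            [0, 0, 0]]  # only the upper triangle (i < j) is read
selected, final_energy = greedy_select(unary, pairwise)
```

On this toy instance the greedy pass selects candidates 0 and 2 and rejects candidate 1, whose overlap penalty with candidate 0 outweighs its own benefit. A standard solver would be used instead when a better optimum is needed.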

Published in: 2014 IEEE Conference on Computer Vision and Pattern Recognition

Date of Conference: 23-28 June 2014

Date Added to IEEE Xplore: 25 September 2014

Electronic ISBN: 978-1-4799-5118-5

DOI: 10.1109/CVPR.2014.479

Conference Location: Columbus, OH, USA

1 Introduction

Many state-of-the-art image parsing or semantic segmentation methods attempt to compute a labeling of every pixel or segmentation region in an image [2], [4], [7], [14], [15], [19], [20]. Despite their rapidly increasing accuracy, these methods have several limitations. First, they have no notion of object instances: given an image with multiple nearby or overlapping cars, these methods are likely to produce a single blob of “car” labels instead of separately delineated instances (Figure 1(a)). In addition, pixel labeling methods tend to be more accurate for “stuff” classes that are characterized by local appearance rather than overall shape, such as road, sky, tree, and building. To do better on “thing” classes such as car, cat, person, and vase, and to gain the ability to represent object instances, it becomes necessary to incorporate detectors that model the overall object shape.
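
The first limitation can be made concrete with a toy example. The sketch below is an illustration only, with a hypothetical `count_blobs` helper and hand-made label grids that are not taken from the paper: two adjacent cars share the same class label, so a connected-component pass over the pixel labeling finds one region where an instance-level annotation has two objects.

```python
from collections import deque


def count_blobs(labels, target):
    """Count 4-connected regions of pixels whose label equals `target`."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for r in range(h):
        for c in range(w):
            if labels[r][c] != target or seen[r][c]:
                continue
            blobs += 1                     # found an unvisited region
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:                   # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx]
                            and labels[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return blobs


# Hypothetical 4x6 image: two cars parked side by side.
pixel_labels = [
    ["road", "road", "road", "road", "road", "road"],
    ["road", "car",  "car",  "car",  "car",  "road"],
    ["road", "car",  "car",  "car",  "car",  "road"],
    ["road", "road", "road", "road", "road", "road"],
]
# Instance annotation for the same image: 0 = no object, 1 and 2 = the two cars.
instance_ids = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 0, 0, 0, 0, 0],
]

car_blobs = count_blobs(pixel_labels, "car")
car_instances = len({i for row in instance_ids for i in row if i != 0})
```

Here `car_blobs` is 1 while `car_instances` is 2: the class labels alone cannot separate the touching cars, which is why some model of overall object shape is needed to recover instances.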
