1. Introduction
Holistic scene understanding is a long-standing task in computer vision, in which a model is trained to explain every pixel in an image, whether that pixel belongs to stuff – uncountable regions of similar texture, such as grass, road, or sky – or to a thing – a countable object with individually identifying characteristics, such as a person or a car. While holistic scene understanding received some early attention [49], [55], [48], modern deep learning-based methods have mainly modeled stuff and things independently, under the task names semantic segmentation and instance segmentation, respectively. Recently, Kirillov et al. [24] proposed the panoptic quality (PQ) metric, unifying these two parallel tracks into the holistic task of panoptic segmentation. Panoptic segmentation is a key step toward visual understanding, with applications in fields such as autonomous driving and robotics, where it is crucial to know both the locations of dynamically trackable things and the extent of static stuff classes. For example, an autonomous car must both avoid other cars with high precision and understand the location of the road and sidewalk in order to stay on a desired path.
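As a point of reference, the PQ metric of [24] first matches each predicted segment p to a ground-truth segment g of the same class whenever their intersection-over-union exceeds 0.5, which partitions all segments into matched pairs TP, unmatched predictions FP, and unmatched ground-truth segments FN (notation following [24]). It then computes

$$\mathrm{PQ} = \frac{\sum_{(p,g)\in TP} \mathrm{IoU}(p,g)}{|TP| + \tfrac{1}{2}|FP| + \tfrac{1}{2}|FN|},$$

which factors into a segmentation-quality term (the mean IoU of matched segments) and a recognition-quality term (an F1-style detection score), so a single number rewards both accurate masks and correct instance-level recognition across stuff and thing classes alike.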