Loading [MathJax]/extensions/MathMenu.js
Compensating for Local Ambiguity With Encoder-Decoder in Urban Scene Segmentation | IEEE Journals & Magazine | IEEE Xplore

Compensating for Local Ambiguity With Encoder-Decoder in Urban Scene Segmentation


Abstract:

Semantic segmentation plays a critical role in scene understanding for self-driving vehicles. A line of efforts has proven that global context matters in urban scene segm...Show More

Abstract:

Semantic segmentation plays a critical role in scene understanding for self-driving vehicles. A line of efforts has proven that global context matters in urban scene segmentation due to massive scale changes. However, we find that existing methods suffer from local ambiguities when dissipating continuous local context, i.e. scrambling to a huge receptive field of global cues by coarse pooling. To this end, this paper proposes a new Context Aggregation Module (CAM) that consists of two primary components: context encoding using no coarse pooling but encoder-decoders with appropriate sampling scales and gated fusion that extends gate attention mechanism to balance different-scale context during feature fusion. Weeding out coarse pooling and applying the encoder-decoder inherits the merits of exploring global context while avoiding the drawback of losing local contextual continuity. We then construct a Context Aggregation Network (CANet) and conduct extensive evaluations on challenging autonomous driving benchmarks of Cityscapes, CamVid and BDD100K. Consistently improved results evidence the effectiveness. Notably, we attain competitive mIoU 82.7% on Cityscapes and optimal mIoU 80.5% on CamVid.
Published in: IEEE Transactions on Intelligent Transportation Systems ( Volume: 23, Issue: 10, October 2022)
Page(s): 19224 - 19235
Date of Publication: 14 March 2022

ISSN Information:

Funding Agency:


I. Introduction

Semantic image segmentation is a dense predictive task that involves assigning mutually exclusive categories for each pixel. It lays a solid foundation for a broad range of high-level vision applications, e.g. autonomous driving, medical diagnosis and image editing. In the environments of intelligent vehicles, semantic segmentation serves as a vital ingredient of the vehicle’s navigation system by locating frontal objects such as roads, vehicles or pavements. It addresses most perception needs in a unified way using monocular cameras, not limited to target tracking [1], lane detection [2], semantic stereo matching [3] or shadow region detection [4].

Contact IEEE to Subscribe

References

References is not available for this document.