Conferences >2023 IEEE 3rd International C...

DAPSPNet: Deep Aggregation Pyramid Strip Pooling Network for Real-time and Accurate Segmentation

Download PDF
Download References
Request Permissions
Save to
Alerts

Abstract:

This paper introduces an efficient Convolutional Neural Networks (CNN) architecture named DAPSPNet for Real-time semantic segmentation. We propose a novel dual-resolution...Show More

Metadata

Abstract:

This paper introduces an efficient Convolutional Neural Networks (CNN) architecture named DAPSPNet for Real-time semantic segmentation. We propose a novel dual-resolution network, DAPSPNet, and augment it with strip pooling in the multi-scale feature extraction module to extract strip-shaped features more effectively. The convolution kernels have lengths of 5, 9, and 17, with a width of 1. We chose strip pooling as the supplement for two reasons. First, strip pooling is a lightweight technique that reduces the number of parameters and computations involved in pooling operations. Second, there are also some strip-shaped features in the contextual information, which are in line with the needs of real-time semantic segmentation of road scenes. Extensive experimental evaluations on the Cityscapes dataset demonstrate the competitive performance of DAPSPNet compared to several state-of-the-art methods in most scenarios.

Published in: 2023 IEEE 3rd International Conference on Digital Twins and Parallel Intelligence (DTPI)

Date of Conference: 07-09 November 2023

Date Added to IEEE Xplore: 26 December 2023

ISBN Information:

DOI: 10.1109/DTPI59677.2023.10365437

Conference Location: Orlando, FL, USA

Funding Agency:

Contents

I. INTRODUCTION

Semantic segmentation is a fundamental task in computer vision that assigns pixel-level labels to images. Deep convolutional neural networks, starting with Fully Convolutional Networks (FCN) [1], have significantly improved semantic segmentation performance. The demand for real-time semantic segmentation has grown rapidly in applications such as autonomous driving [2], [2]–[4], video surveillance, and robot sensing [5]–[7], driving the need for efficient segmentation networks, especially in the mobile domain. While models like U-Net [8] have shown excellent accuracy, their real-time inference capability is limited. To address this, lightweight backbones and feature fusion/aggregation modules have been explored in methods like MLFNet [9] and BiSeNet [10] to balance speed and accuracy. However, reducing the input resolution can result in loss of fine details. BiSeNet tackles this by combining low-level details and high-level semantics, while HRNet [11] and DDRNet [12] propose parallelism for improved accuracy. Various approaches, including Atrous Spatial Pyramid Pooling (ASPP) [13], Pyramid Pooling Module (PPM) [14], Depthwiseconv Spatial Pyramid (DSP) [9], and Deep Aggregation Pyramid Pooling Module (DAPPM) [12], have also explored capturing semantic information at different scales using receptive fields of different sizes (n×n). However, relying solely on n×n receptive fields is deemed insufficient [15]. In this work, we propose integrating receptive fields of size n×1 and 1×n to enhance the model’s ability to capture semantic information across scales.

References is not available for this document.

DAPSPNet: Deep Aggregation Pyramid Strip Pooling Network for Real-time and Accurate Segmentation

Abstract:

Metadata

Abstract:

Funding Agency:

I. INTRODUCTION

References

IEEE Account

Purchase Details

Profile Information

Need Help?

DAPSPNet: Deep Aggregation Pyramid Strip Pooling Network for Real-time and Accurate Segmentation

Alerts

Abstract:

Metadata

Abstract:

Funding Agency:

I. INTRODUCTION

Authors

Figures

References

Citations

Keywords

Metrics

Footnotes

References

IEEE Account

Purchase Details

Profile Information

Need Help?