
STN: Saliency-Guided Transformer Network for Point-Wise Semantic Segmentation of Urban Scenes



Abstract:

Accurate and effective road object semantic segmentation plays a significant role in supporting a wide range of intelligent transportation system (ITS) applications. However, most existing image-based and point-based methods cannot deliver satisfactory segmentation accuracy and robustness, especially in complex urban road scenes. Thus, in this letter, we design a saliency-guided transformer network (STN) for point-wise semantic segmentation of mobile laser scanning (MLS) point clouds. First, four types of feature saliency maps are constructed to obtain more compact feature spaces that enhance the semantics of the feature encoding. Then, integrating offset attention mechanisms and edge convolutions, an effective point-wise transformer network is proposed to extract high-level features for point-wise label assignment of road objects. The STN model is evaluated on the Paris-Lille-3D (PL3D) dataset and achieves satisfactory experimental results, with 87.2% overall accuracy (OA) and 81.7% mean intersection over union (mIoU). Comparative studies with five deep learning-based methods also demonstrate the superior performance of the STN model for large-scale semantic segmentation tasks.
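
The letter itself does not include source code. As a rough illustration of the offset attention mechanism named in the abstract, the following PyTorch sketch follows the formulation popularized in the point cloud transformer (PCT) literature: attention is computed over all points, and the residual between the input and the attended features is transformed and added back. The layer widths, the normalization scheme, and the OffsetAttention name are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class OffsetAttention(nn.Module):
    """Sketch of an offset-attention block (PCT-style); details are assumed."""

    def __init__(self, channels: int):
        super().__init__()
        self.q_conv = nn.Conv1d(channels, channels // 4, 1, bias=False)
        self.k_conv = nn.Conv1d(channels, channels // 4, 1, bias=False)
        self.v_conv = nn.Conv1d(channels, channels, 1)
        # LBR: linear (1x1 conv) + batch norm + ReLU, applied to the offset.
        self.lbr = nn.Sequential(
            nn.Conv1d(channels, channels, 1),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, num_points)
        q = self.q_conv(x).permute(0, 2, 1)        # (B, N, C/4)
        k = self.k_conv(x)                         # (B, C/4, N)
        v = self.v_conv(x)                         # (B, C, N)
        energy = torch.bmm(q, k)                   # (B, N, N) point-to-point scores
        attn = torch.softmax(energy, dim=-1)
        attn = attn / (1e-9 + attn.sum(dim=1, keepdim=True))
        context = torch.bmm(v, attn)               # (B, C, N) attended features
        # Offset attention: transform the residual (x - context), then add back.
        return x + self.lbr(x - context)

The offset (input minus attended features) is reported in the PCT literature to be more discriminative than the raw attention output for point clouds; how STN combines this block with its saliency maps is described in the full letter.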
Published in: IEEE Geoscience and Remote Sensing Letters (Volume: 19)
Article Sequence Number: 7004405
Date of Publication: 14 July 2022


I. Introduction

Point-wise semantic segmentation, which assigns a semantic label to every point in a point cloud, is an essential process for supporting a wide range of applications, including intelligent robotics, autonomous vehicles, and digital twins. Compared with 2-D optical images, 3-D point clouds capture the spatial position, orientation, and geometric shape of road objects more precisely and can be acquired more frequently. Most significantly, they are less sensitive to illumination conditions, shadows, and viewpoint variations [1]. Unlike 2-D images with a regular grid structure, however, 3-D point clouds captured by light detection and ranging (LiDAR) sensors are unorganized and irregularly distributed, which makes efficient and accurate road object semantic segmentation challenging, especially in complex, large-scale urban areas [2].
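
Edge convolution, which the abstract names as the second building block of STN, is one standard way to handle this unorganized format: each point gathers its k nearest neighbors in feature space, forms permutation-invariant edge features, and aggregates them with a max. The sketch below follows the DGCNN-style formulation; the neighborhood size, MLP widths, and the EdgeConv and knn_indices names are assumptions for illustration, not the letter's exact design.

import torch
import torch.nn as nn

def knn_indices(x: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k nearest neighbors per point; x is (B, C, N)."""
    # Pairwise squared distances via ||a-b||^2 = ||a||^2 - 2ab + ||b||^2.
    inner = -2 * torch.bmm(x.transpose(1, 2), x)       # (B, N, N)
    sq = (x ** 2).sum(dim=1, keepdim=True)             # (B, 1, N)
    dist = -(sq.transpose(1, 2) + inner + sq)          # negated distance
    return dist.topk(k=k, dim=-1).indices              # (B, N, k)

class EdgeConv(nn.Module):
    """Sketch of a DGCNN-style edge-convolution layer; details are assumed."""

    def __init__(self, in_ch: int, out_ch: int, k: int = 20):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, num_points)
        b, c, n = x.shape
        idx = knn_indices(x, self.k)                   # (B, N, k)
        # Flatten batch offsets so we can gather neighbor features in one go.
        idx_flat = idx + torch.arange(b, device=x.device).view(-1, 1, 1) * n
        feats = x.transpose(1, 2).reshape(b * n, c)[idx_flat.view(-1)]
        neigh = feats.view(b, n, self.k, c).permute(0, 3, 1, 2)   # (B, C, N, k)
        center = x.unsqueeze(-1).expand(-1, -1, -1, self.k)       # (B, C, N, k)
        # Edge feature [x_i, x_j - x_i], shared MLP, then max over neighbors.
        edge = torch.cat([center, neigh - center], dim=1)         # (B, 2C, N, k)
        return self.mlp(edge).max(dim=-1).values                  # (B, out_ch, N)

Because both the k-nearest-neighbor graph and the max aggregation are independent of point ordering, such a layer is well suited to the unordered LiDAR data described above.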

