Loading [MathJax]/extensions/MathMenu.js
Semantic Segmentation Method for Remote Sensing Urban Scenes Based on Swin-Transformer and Convolutional Neural Network | IEEE Conference Publication | IEEE Xplore

Semantic Segmentation Method for Remote Sensing Urban Scenes Based on Swin-Transformer and Convolutional Neural Network


Abstract:

The task of semantic segmentation of urban scenes in remote sensing has extensive applications in land cover mapping, urban change detection, and environmental protection...Show More

Abstract:

The task of semantic segmentation of urban scenes in remote sensing has extensive applications in land cover mapping, urban change detection, and environmental protection. However, due to the significant intra-class heterogeneity, inter-class similarity, and the presence of small-scale objects in remote sensing urban scenes, convolutional neural networks (CNNs) often struggle to fully utilize contextual information, resulting in low segmentation accuracy, incomplete segmentation, and misclassification of similar classes. To address these issues, we propose the Swin-MDFF for semantic segmentation of remote sensing urban scenes. This method employs an encoder-decoder structure that combines Swin-Transformer and CNNs. The Swin-Transformer serves as the encoder to extract multi-scale semantic features and contextual information, while the Multi-Scale Dilated Feature Fusion (MDFF) CNNs as the decoder to aggregate multi-scale semantic features, effectively considering both local and global contextual information. This method has been tested on the Vaihingen and Potsdam datasets, outperforming mainstream networks with mean Intersection over Union (mIoU) of 84.32% and 87.82%, and mean F1-score (mF1) of 91.35% and 93.40%, respectively. The experimental results demonstrate the effectiveness of this method in the semantic segmentation of remote sensing urban scenes.
Date of Conference: 20-22 September 2024
Date Added to IEEE Xplore: 04 November 2024
ISBN Information:

ISSN Information:

Conference Location: Chongqing, China
No metrics found for this document.

I. Introduction

With the continuous advancement of sensor technology, a large number of high-resolution remote sensing images are being captured and used for semantic segmentation tasks in urban scenes. Remote sensing urban scene semantic segmentation can effectively address various issues in urban planning [1], promote the rational use of urban land [2], and accurately monitor changes in urban buildings [3]. Additionally, it is crucial for monitoring urban road traffic facilities [4], green space planning [5], and environmental monitoring [6].

Usage
Select a Year
2025

View as

Total usage sinceNov 2024:155
051015202530JanFebMarAprMayJunJulAugSepOctNovDec292729000000000
Year Total:85
Data is updated monthly. Usage includes PDF downloads and HTML views.

References

References is not available for this document.