Abstract:
Object detection plays a crucial role in scene understanding and has extensive practical applications. In remote sensing object detection, both detection accuracy and robustness are of significant concern. Existing methods rely heavily on sophisticated adversarial training strategies that tend to improve robustness at the expense of accuracy. However, detection robustness is not always indicative of improved accuracy. Therefore, in this article, we investigate how to enhance robustness while preserving high accuracy, or even improve both simultaneously, using simple vanilla adversarial training or none at all. In pursuit of a solution, we first conduct an exploratory investigation by shifting our attention from adversarial training, referred to as adversarial fine-tuning, to adversarial pretraining. Specifically, we propose a novel pretraining paradigm, namely, structured adversarial self-supervised (SASS) pretraining, to strengthen both clean accuracy and adversarial robustness for object detection in remote sensing images. At a high level, SASS pretraining aims to unify adversarial learning and self-supervised learning in the pretraining stage and to encode structured knowledge into the pretrained representations so that they transfer well to downstream detection. Moreover, to fully explore the inherent robustness of vision Transformers and improve their pretraining efficiency, we adopt the recent masked image modeling (MIM) as the pretext task and instantiate SASS pretraining as a concise end-to-end framework, named structured adversarial MIM (SA-MIM). SA-MIM consists of two pivotal components: structured adversarial attack and structured MIM (S-MIM). The former establishes structured adversaries in the context of adversarial pretraining, while the latter introduces a structured local-sampling global-masking strategy to adapt to hierarchical encoder architectures.
Comprehensive experiments on three different datasets have dem...
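The abstract names masked image modeling (MIM) as the pretext task but does not spell out the structured local-sampling global-masking strategy. As a point of reference only, the plain random patch masking commonly used in MIM can be sketched as below. The function name `random_patch_mask`, the 4 x 4 patch grid, and the 0.75 mask ratio are illustrative assumptions, not the authors' SA-MIM implementation.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, seed=0):
    """Return a grid_h x grid_w grid of booleans; True marks a masked patch.

    Generic MIM-style random masking (hypothetical helper), not the
    structured strategy described in the abstract.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    num_patches = grid_h * grid_w
    num_masked = int(round(mask_ratio * num_patches))
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    masked = set(rng.sample(range(num_patches), num_masked))
    # Map flat patch indices back onto the 2D patch grid.
    return [
        [(r * grid_w + c) in masked for c in range(grid_w)]
        for r in range(grid_h)
    ]


# Assumed example: a 4 x 4 patch grid with 75% of patches hidden.
mask = random_patch_mask(grid_h=4, grid_w=4, mask_ratio=0.75)
visible = sum(not m for row in mask for m in row)
```

In a typical MIM setup the encoder would see only the visible patches and a decoder would be trained to reconstruct the masked ones; how SA-MIM samples and masks for hierarchical encoders is not specified in this excerpt.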
Published in: IEEE Transactions on Geoscience and Remote Sensing (Volume: 62)