
Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images



Abstract:

Object detection in very high resolution optical remote sensing images is a fundamental problem in remote sensing image analysis. Owing to advances in powerful feature representations, machine-learning-based object detection is receiving increasing attention. Although numerous feature representations exist, most of them are handcrafted or shallow-learning-based features. As object detection tasks become more challenging, their description capability becomes limited or even impoverished. More recently, deep learning algorithms, especially convolutional neural networks (CNNs), have shown much stronger feature representation power in computer vision. Despite the progress made on natural scene images, it is problematic to use CNN features directly for object detection in optical remote sensing images because it is difficult to effectively handle object rotation variations. To address this problem, this paper proposes a novel and effective approach that learns a rotation-invariant CNN (RICNN) model to advance object detection performance, achieved by introducing and learning a new rotation-invariant layer on top of existing CNN architectures. Unlike traditional CNN models, which are trained by optimizing only the multinomial logistic regression objective, the RICNN model is trained by optimizing a new objective function that imposes a regularization constraint, which explicitly enforces the feature representations of training samples before and after rotation to be mapped close to each other, hence achieving rotation invariance. To facilitate training, we first train the rotation-invariant layer and then domain-specifically fine-tune the whole RICNN network to further boost the performance. Comprehensive evaluations on a publicly available ten-class object detection data set demonstrate the effectiveness of the proposed method.
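
The rotation-invariance constraint described above can be made concrete with a short sketch. The following is a minimal illustration, not the authors' exact formulation: it assumes a feature extractor `backbone`, a classifier head `head`, and a weight `lambda_rot` (all names hypothetical), uses only 90/180/270 degree rotations for brevity, and adds a penalty that pulls the features of each sample and its rotated copies together, alongside the standard softmax (multinomial logistic regression) loss.

```python
# Sketch of a rotation-invariance regularizer in the spirit of the abstract.
# `backbone`, `head`, and `lambda_rot` are illustrative placeholders.
import torch
import torch.nn.functional as F

def ricnn_loss(backbone, head, images, labels, lambda_rot=0.01):
    """images: (N, C, H, W) tensor; labels: (N,) tensor of class indices."""
    # Feature representations of the original samples.
    feats = backbone(images)                                  # (N, D)

    # Features of rotated copies (90/180/270 degrees here for brevity;
    # the paper describes a family of rotation angles).
    rot_feats = [backbone(torch.rot90(images, k, dims=(2, 3))) for k in (1, 2, 3)]
    mean_rot = torch.stack(rot_feats, dim=0).mean(dim=0)      # (N, D)

    # Standard multinomial logistic regression (softmax cross-entropy) term.
    cls_loss = F.cross_entropy(head(feats), labels)

    # Regularization term: features before and after rotation should be
    # mapped close to each other.
    rot_penalty = ((feats - mean_rot) ** 2).sum(dim=1).mean()

    return cls_loss + lambda_rot * rot_penalty
```

Per the abstract, training would proceed in two stages: the new rotation-invariant layer is learned first with an objective of this kind, and the whole RICNN is then domain-specifically fine-tuned.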
Published in: IEEE Transactions on Geoscience and Remote Sensing (Volume: 54, Issue: 12, December 2016)
Page(s): 7405 - 7415
Date of Publication: 05 September 2016



I. Introduction

Object detection in very high resolution (VHR) optical remote sensing images is a fundamental problem in aerial and satellite image analysis. In recent years, owing to advances in machine learning, particularly powerful feature representations and classifiers, many approaches treat object detection as a classification problem and have shown impressive success on specific object detection tasks [1]–[24]. In these approaches, object detection is performed by learning a classifier, such as a support vector machine (SVM) [1], [7], [8], [12], [13], [20]–[24], AdaBoost [2]–[5], k-nearest neighbors [15], [17], conditional random fields [6], [19], or a sparse-coding-based classifier [9]–[11], [14], [16], which captures the variation in object appearances and views from a set of training data in a supervised [2]–[7], [9]–[14], [16]–[21], [23], [24], semisupervised [15], [22], or weakly supervised [1], [8], [25], [51] framework. The input of the classifier is a set of image regions with their corresponding feature representations, and the output is their predicted labels, i.e., object or not. A recent review of object detection in optical remote sensing images can be found in [26].
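
As a concrete illustration of this classification view of detection, the sketch below trains a classifier (here an SVM) on precomputed region features and labels candidate regions as object or background. File names and parameters are hypothetical and not tied to any particular method in [1]–[24].

```python
# Minimal sketch of classification-based object detection on region features.
import numpy as np
from sklearn.svm import SVC

# Hypothetical precomputed inputs:
# train_feats:  (n_regions, feat_dim) feature vectors of training regions
# train_labels: (n_regions,) with 1 = object, 0 = background
train_feats = np.load("train_region_features.npy")
train_labels = np.load("train_region_labels.npy")

# Learn the object / non-object decision from labeled region features.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(train_feats, train_labels)

# At detection time, candidate regions from a test image are scored the same way.
test_feats = np.load("test_region_features.npy")
predictions = clf.predict(test_feats)   # 1 = object, 0 = not an object
```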
