Deep Learning Earth Observation Classification Using ImageNet Pretrained Networks


Abstract:

Deep learning methods such as convolutional neural networks (CNNs) can deliver highly accurate classification results when provided with sufficiently large data sets and their corresponding labels. However, training CNNs with limited labeled data is problematic, as it leads to extensive overfitting. In this letter, we propose a novel method that takes a CNN pretrained on an entirely different classification problem, namely, the ImageNet challenge, and exploits it to extract an initial set of representations. The derived representations are then transferred, along with their class labels, into a supervised CNN classifier, which is trained on them. Through this two-stage framework, we successfully deal with the limited-data problem in an end-to-end processing scheme. Comparative results on the UC Merced Land Use benchmark show that our method significantly outperforms the best previously reported results, improving the overall accuracy from 83.1% to 92.4%. Apart from the accuracy improvement, our method introduces a novel feature fusion algorithm that effectively tackles the large data dimensionality through a simple and computationally efficient approach.
Published in: IEEE Geoscience and Remote Sensing Letters (Volume: 13, Issue: 1, January 2016)
Page(s): 105 - 109
Date of Publication: 01 December 2015

I. Introduction

Supervised classification of very high spatial resolution (VHSR) images is still an open research topic in the remote sensing (RS) field. Monitoring urbanization trends has become a crucial objective, and there is currently a high demand for automatic RS classification techniques. In recent years, advanced methodologies have contributed significantly to solving the VHSR classification problem. Predominantly, methods based on the bag-of-visual-words (BoVW) approach have been proposed; these learn a dictionary that represents the image content in an unsupervised manner, using well-established feature descriptors (HOG, SIFT, etc.) and clustering algorithms. Such approaches include the spatial pyramid matching kernel (SPMK) [1], spatial pyramid co-occurrence kernel (SPCK++) [2], min-tree kd-tree [3], and sparse coding [4] methods. However, the main drawback of all these techniques lies in the assumption that a general feature descriptor, built from expert knowledge as a manually designed, general-purpose feature, can adequately represent complex image structures.
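To make the BoVW idea above concrete, the following is a minimal sketch rather than the pipeline of any of the cited methods [1]-[4]. It assumes local descriptors have already been extracted; random 128-dimensional vectors stand in for SIFT-like descriptors here. It uses plain k-means as the clustering algorithm. The function names (learn_dictionary, assign_words, bovw_histogram) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def assign_words(descriptors, centres):
    """Index of the nearest visual word (cluster centre) for each descriptor."""
    # Squared Euclidean distance from every descriptor to every centre.
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def learn_dictionary(descriptors, k, n_iter=20, seed=0):
    """Learn k visual words with plain k-means (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialise the centres from k distinct descriptors.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centres = descriptors[idx].copy()
    for _ in range(n_iter):
        labels = assign_words(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old centre if a cluster is empty
                centres[j] = members.mean(axis=0)
    return centres


def bovw_histogram(descriptors, centres):
    """Represent one image as a normalised histogram of visual-word counts."""
    words = assign_words(descriptors, centres)
    hist = np.bincount(words, minlength=len(centres)).astype(float)
    return hist / hist.sum()


# Assumption: synthetic stand-ins for descriptors pooled from training images.
rng = np.random.default_rng(42)
training_descriptors = rng.normal(size=(500, 128))
centres = learn_dictionary(training_descriptors, k=8)

# Encode a new "image" that yields 60 local descriptors.
image_descriptors = rng.normal(size=(60, 128))
hist = bovw_histogram(image_descriptors, centres)
```

The resulting histogram is a fixed-length vector, whatever the number of descriptors in the image, and can be passed to any standard classifier. Its quality is bounded by the hand-designed descriptor that feeds it, which is the limitation the paragraph above points out.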

References

[1] S. Lazebnik, C. Schmid and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories", Proc. IEEE Comput. Vis. Pattern Recog., vol. 2, pp. 2169-2178, 2006.
[2] Y. Yang and S. Newsam, "Spatial pyramid co-occurrence for image classification", Proc. IEEE ICCV, pp. 1465-1472, 2011.
[3] L. Gueguen, "Classifying compound structures in satellite images: A compressed representation for fast queries", IEEE Trans. Geosci. Remote Sens., vol. 53, no. 4, pp. 1803-1818, Apr. 2015.
[4] A. M. Cheriyadat, "Unsupervised feature learning for aerial scene classification", IEEE Trans. Geosci. Remote Sens., vol. 52, no. 1, pp. 439-451, Jan. 2014.
[5] P. Vincent, H. Larochelle, Y. Bengio and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders", Proc. 25th Int. Conf. Mach. Learn., pp. 1096-1103, 2008.
[6] F. Zhang, B. Du and L. Zhang, "Saliency-guided unsupervised feature learning for scene classification", IEEE Trans. Geosci. Remote Sens., vol. 53, no. 4, pp. 2175-2184, Apr. 2015.
[7] O. Firat, G. Can and F. T. Yarman Vural, "Representation learning for contextual object and region detection in remote sensing", Proc. 22nd IEEE ICPR, pp. 3708-3713, 2014.
[8] J. Donahue et al., "DeCAF: A deep convolutional activation feature for generic visual recognition", 2013, [online] Available: http://arxiv.org/abs/1310.1531.
[9] P. Sermanet et al., "OverFeat: Integrated recognition localization and detection using convolutional networks", 2013, [online] Available: http://arxiv.org/abs/1312.6229.
[10] P. Sermanet and Y. LeCun, "Traffic sign recognition with multi-scale convolutional networks", Proc. IJCNN, pp. 2809-2813, 2011.
[11] R. Socher, B. Huval, B. Bhat, C. D. Manning and A. Y. Ng, "Convolutional-recursive deep learning for 3D object classification", Proc. Adv. Neural Inf. Process. Syst., pp. 665-673, 2012.
[12] A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet classification with deep convolutional neural networks", Proc. Adv. Neural Inf. Process. Syst., pp. 1097-1105, 2012.