Abstract:
In this paper, we consider two inter-dependent deep networks, where one network taps into the other, to perform two challenging cognitive vision tasks - scene classificat...Show MoreMetadata
Abstract:
In this paper, we consider two inter-dependent deep networks, where one network taps into the other, to perform two challenging cognitive vision tasks - scene classification and object recognition jointly. Recently, convolutional neura networks have shown promising results in each of these tasks. However, as scene and objects are interrelated, the performance of both of these recognition tasks can be further improved by exploiting dependencies between scene and object deep networks. The advantages of considering the inter-dependency between these networks are the following: 1. improvement of accuracy in both scene and object classification, and 2. significant reduction of computational cost in object detection. In order to formulate our framework, we employ two convolutional neural networks (CNNs), scene-CNN and object-CNN. We utilize scene-CNN to generate object proposals which indicate the probable object locations in an image. Object proposals found in the process are semantically relevant to the object. More importantly, the number of object proposals is fewer in amount when compared to other existing methods which reduces the computational cost significantly. Thereafter, in scene classification, we train three hidden layers in order to combine the global (image as a whole) and local features (object information in an image). Features extracted from CNN architecture along with the features processed from object-CNN are combined to perform efficient classification. We perform rigorous experiments on five datasets to demonstrate that our proposed framework outperforms other state-of-the-art methods in classifying scenes as well as recognizing objects.
Date of Conference: 04-08 December 2016
Date Added to IEEE Xplore: 24 April 2017
ISBN Information: