1. Introduction
Following the success of deep learning in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [20], the best performance in classification competitions has almost invariably been achieved with convolutional neural network (CNN) architectures. AlexNet [16] is composed of convolutions with three receptive field sizes (11×11, 5×5, and 3×3). VGG [21] is based on the idea that a stack of two 3×3 convolutional layers, which has an effective receptive field of 5×5, is more effective than a single 5×5 convolution. GoogLeNet [24]–[26] introduced the Inception layer, which composes convolutions with various receptive fields. The residual network [10], [11], [29], which adds shortcut connections to implement identity mapping, allows more layers to be stacked without running into the vanishing gradient problem. Recent research on CNNs has mostly focused on the composition of layers rather than on the convolution operation itself.
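To make the shortcut-connection idea concrete, the sketch below shows a minimal residual block in which the unmodified input is added to the output of two stacked 3×3 convolutions. This is an illustrative example assuming a PyTorch environment; the channel count and layer choices are placeholders, not details taken from the architectures cited above.

```python
# Minimal sketch of a residual block with an identity shortcut (hypothetical
# example; channel sizes and normalization choices are illustrative only).
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Two stacked 3x3 convolutions whose output is added to the input."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Shortcut connection: add the unmodified input (identity mapping)
        # before the final nonlinearity.
        return self.relu(out + x)


if __name__ == "__main__":
    block = ResidualBlock(channels=16)
    y = block(torch.randn(1, 16, 32, 32))
    print(y.shape)  # torch.Size([1, 16, 32, 32])
```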