I. Introduction
Many fundamental tasks in multimedia and computer vision, including image classification, annotation, and retrieval, rely on the efficient extraction of discriminative features, especially in the big data era. Recently, many researchers have focused on sparse coding-based representations for problems related to image/video classification [1]–[5], annotation [6], and concept detection [7]. Compared to vector quantization (VQ) in the traditional bag-of-words (BoW) model [8], sparse coding greatly reduces the reconstruction error of local features and can capture the salient properties of images. Combined with spatial pyramid matching [9], the sparse coding algorithm achieves state-of-the-art performance even with simple linear classifiers.

On the other hand, one significant limitation of the sparse coding representation is the huge computational cost of coding local features, especially when a large codebook is used. Typically, it takes more than one second to solve the sparse coding problem for a single image, which makes sparse coding features difficult to extract in real-world applications.

To tackle this problem, some researchers have attempted to enforce locality on the local features in order to reduce the effective size of the codebook. For example, Yu et al. [3] empirically found that sparse coding results tend to be local and accordingly proposed the local coordinate coding (LCC) scheme, which uses a subset of codewords to perform sparse coding. Wang et al. [2] further proposed the locality-constrained linear coding (LLC) scheme, which explicitly incorporates a locality constraint instead of the sparsity constraint to obtain an analytical solution. In addition to locality, other researchers have approached the problem either by using supervised codebook learning [10]–[12] to improve classification accuracy even with small codebooks, or by utilizing higher-order sparse coding based on a small codebook [5], [13].
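To make the locality idea concrete, the following is a minimal sketch of LLC-style coding in the spirit of Wang et al. [2]: a feature is encoded over only its k nearest codewords, and the code is obtained analytically by solving a small constrained least-squares system rather than an iterative sparse-coding optimization. The function name, the regularization constant, and the matrix sizes are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def llc_code(x, codebook, k=5):
    """Illustrative LLC-style coding: encode feature x over its k nearest
    codewords with an analytical least-squares solution (sketch, not the
    original implementation)."""
    # x: (d,) local feature; codebook: (M, d) matrix of M codewords
    dists = np.linalg.norm(codebook - x, axis=1)
    idx = np.argsort(dists)[:k]            # locality: keep k nearest codewords
    B = codebook[idx]                      # (k, d) local base
    z = B - x                              # shift codewords to the feature's origin
    C = z @ z.T                            # (k, k) local covariance
    C += 1e-4 * np.trace(C) * np.eye(k)    # small regularizer for stability (assumed value)
    w = np.linalg.solve(C, np.ones(k))     # analytical solution of the local system
    w /= w.sum()                           # enforce the sum-to-one constraint
    code = np.zeros(len(codebook))
    code[idx] = w                          # all other codewords get zero weight
    return code
```

Because only a k-by-k system is solved per feature, the cost is essentially independent of the full codebook size M, which is the source of LLC's speed advantage over standard sparse coding.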