Target-Aware Deep Tracking | IEEE Conference Publication | IEEE Xplore

Target-Aware Deep Tracking


Abstract:

Existing deep trackers mainly use convolutional neural networks pre-trained for the generic object recognition task to extract feature representations. Despite demonstrated successes in numerous vision tasks, pre-trained deep features contribute less to visual tracking than they do to object recognition. The key issue is that in visual tracking the targets of interest can belong to arbitrary object classes and take arbitrary forms. As such, pre-trained deep features are less effective at modeling these targets and distinguishing them from the background. In this paper, we propose a novel scheme to learn target-aware features, which can better recognize targets undergoing significant appearance variations than pre-trained deep features. To this end, we develop a regression loss and a ranking loss to guide the generation of target-active and scale-sensitive features. We identify the importance of each convolutional filter according to the back-propagated gradients and select the target-aware features based on activations for representing the targets. The target-aware features are integrated with a Siamese matching network for visual tracking. Extensive experimental results show that the proposed algorithm performs favorably against state-of-the-art methods in terms of accuracy and speed.
Date of Conference: 15-20 June 2019
Date Added to IEEE Xplore: 09 January 2020
Conference Location: Long Beach, CA, USA

1. Introduction

Visual tracking is one of the fundamental computer vision problems with a wide range of applications. Given a target object specified by a bounding box in the first frame, visual tracking aims to locate the target object in the subsequent frames. This is challenging as target objects often undergo significant appearance changes over time and may temporarily leave the field of view. Conventional trackers prior to the advances of deep learning mainly consist of a feature extraction module and a decision-making mechanism. Recent state-of-the-art deep trackers often use deep models pre-trained for the object recognition task to extract features, while putting more emphasis on designing effective decision-making modules. While various decision models, such as correlation filters [15], regressors [14], [35], [38], [37], and classifiers [16], [29], [32], have been extensively explored, considerably less attention has been paid to learning more discriminative deep features.
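
The abstract describes ranking convolutional filters by their back-propagated gradients and keeping only the most relevant ones. The sketch below illustrates that idea on toy data in plain Python. It is not the authors' implementation: it assumes that a filter's importance is the spatial mean of its gradient map and that the top `k` filters are kept, neither of which is specified in this excerpt. The names `channel_importance`, `select_top_channels`, and `k` are hypothetical.

```python
from typing import List, Sequence

# One H x W map of values (gradients of a loss w.r.t. one feature channel).
Grid = Sequence[Sequence[float]]


def channel_importance(grads: Sequence[Grid]) -> List[float]:
    """Score each channel by the spatial mean of its gradient map.

    Assumption: the mean gradient is used as the importance score.
    """
    scores = []
    for grid in grads:
        total = sum(sum(row) for row in grid)
        count = sum(len(row) for row in grid)
        scores.append(total / count)
    return scores


def select_top_channels(grads: Sequence[Grid], k: int) -> List[int]:
    """Return the indices of the k highest-scoring channels, in index order."""
    scores = channel_importance(grads)
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return sorted(ranked[:k])


# Toy gradient maps for four channels (made-up numbers for illustration).
toy_grads = [
    [[0.1, 0.1], [0.1, 0.1]],
    [[0.9, 0.7], [0.8, 0.6]],
    [[-0.2, 0.0], [0.0, 0.2]],
    [[0.5, 0.5], [0.3, 0.3]],
]

selected = select_top_channels(toy_grads, k=2)
print(selected)
```

In a real tracker the gradient maps would come from back-propagating a loss through a pre-trained network, and the selected channel indices would be used to slice the feature tensor before matching.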
