Online Multi-Scale Classification and Global Feature Modulation for Robust Visual Tracking | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Recent advanced trackers, which combine discriminative classification with dedicated bounding box estimation, have markedly improved the performance of visual object tracking. However, existing methods do not yet satisfy the demands of tracking in complex scenes involving occlusion, scale variation, and similar challenges. To this end, we propose a novel tracker with online multi-scale classification and global feature modulation for robust visual tracking. It builds on accurate tracking by overlap maximization (ATOM) and is named ATOM+. First, coordinate attention (CA) is applied to enhance the target features in both the channel and spatial dimensions, which strengthens the feature representation ability of the backbone network. Second, an online multi-scale classification (OMC) module is designed. During the online tracking phase, it generates more reliable matching responses by aggregating information from different scales related to the target. This operation enables the tracker to perceive the target stably, particularly when the appearance and posture of the target change severely. Third, a global feature modulation (GFM) mechanism, which requires only a small amount of computational resources, is constructed to fuse the spatial contextual information of the template image into the search region. This integration refines the bounding box to obtain an accurate estimate of the target state. Finally, comprehensive experiments on the conventional tracking benchmarks OTB100, LaSOT, and VOT2018 show that our tracker handles a range of challenging scenarios and achieves state-of-the-art performance. Our tracker runs in real time at an average speed of 37 FPS.
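
The idea of aggregating responses from different scales can be illustrated with a small sketch. The abstract does not specify how the OMC module fuses its responses, so the weighted average below is purely an assumption for illustration, and the names `fuse_score_maps` and `peak` are hypothetical. The toy score maps are invented values, not results from the paper.

```python
from typing import List, Sequence, Tuple

ScoreMap = List[List[float]]


def fuse_score_maps(maps: Sequence[ScoreMap], weights: Sequence[float]) -> ScoreMap:
    """Weighted average of equally sized score maps (assumed fusion rule)."""
    if not maps or len(maps) != len(weights):
        raise ValueError("need one weight per score map")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    rows, cols = len(maps[0]), len(maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for score_map, w in zip(maps, weights):
        if len(score_map) != rows or any(len(r) != cols for r in score_map):
            raise ValueError("all score maps must share one shape")
        for i in range(rows):
            for j in range(cols):
                fused[i][j] += (w / total) * score_map[i][j]
    return fused


def peak(score_map: ScoreMap) -> Tuple[int, int]:
    """Return the (row, col) position of the highest score."""
    positions = (
        (i, j) for i in range(len(score_map)) for j in range(len(score_map[0]))
    )
    return max(positions, key=lambda ij: score_map[ij[0]][ij[1]])


# Toy responses from three hypothetical scales. The middle one has a
# distractor peak at (2, 2); the other two agree on the target at (1, 1).
small = [[0.1, 0.2, 0.1], [0.2, 0.9, 0.3], [0.1, 0.2, 0.1]]
medium = [[0.1, 0.1, 0.1], [0.1, 0.6, 0.2], [0.1, 0.2, 0.8]]
large = [[0.0, 0.1, 0.1], [0.2, 0.7, 0.2], [0.1, 0.1, 0.3]]

fused = fuse_score_maps([small, medium, large], [1.0, 1.0, 1.0])
print(peak(medium), peak(fused))
```

In this toy case a single scale is misled by the distractor, while the fused map peaks where the scales agree. The actual module may weight or combine its responses quite differently.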
Page(s): 5321 - 5334
Date of Publication: 18 December 2023


I. Introduction

Visual object tracking is a critical technology in computer vision that is widely used in intelligent driving [1], human-computer interaction [2], and video surveillance [3]. The core task is to automatically estimate the position and shape of the target in subsequent frames, given its initial position in the first frame of the video. However, complex backgrounds can cause problems such as occlusion and interference from similar targets, which make object tracking challenging.
