Action Association Learning Network for Weakly-supervised Temporal Action Localization

Abstract:

Weakly-supervised temporal action localization (WTAL) is a challenging computer vision task that aims to identify the temporal boundaries of actions in videos when only video-level labels are available. However, most existing approaches give little consideration to the action association information in videos. In this paper, we investigate the impact of action associations and propose a new framework, the Action Association Learning Network (AAL-Net), for temporal localization. First, we introduce attention modules to exploit association information in snippet features; class-aware and class-agnostic action associations are each fused into the features. Second, we propose a hybrid Class Activation Sequence (CAS) strategy based on adaptive thresholds to enlarge the proposal pool and adjust temporal attention weights. Third, during action proposal generation, we propose a method to adjust initial proposals using the learned action association weights. Finally, we conduct extensive comparative experiments on two datasets to evaluate the effectiveness of the proposed temporal action localization model. The model leverages action association information to improve localization accuracy and achieves state-of-the-art results on both datasets. Ablation studies further demonstrate the efficacy of our methods.

Published in: 2024 IEEE 7th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

Date of Conference: 15-17 March 2024

Date Added to IEEE Xplore: 25 April 2024

DOI: 10.1109/IAEAC59436.2024.10503946

Conference Location: Chongqing, China

I. Introduction

Weakly-supervised Temporal Action Localization (WTAL) aims to precisely localize the temporal boundaries of action instances in untrimmed videos and to identify their action categories, given only video-level labels. Because detailed temporal annotation is costly and often impractical to obtain in real-world scenarios, WTAL has become a focus of current research.
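
As a rough illustration of the proposal-generation idea mentioned in the abstract, the sketch below thresholds one class's activation sequence at several levels and collects each contiguous run of above-threshold snippets as a candidate segment. It is not AAL-Net's method: the function names `adaptive_thresholds` and `cas_to_proposals`, the min-max rule for deriving thresholds, and the mean-score attached to each segment are all assumptions made for illustration, since the source does not specify these details.

```python
from typing import List, Tuple

Proposal = Tuple[int, int, float]  # (start snippet, end snippet exclusive, mean score)


def adaptive_thresholds(cas: List[float], fractions: List[float]) -> List[float]:
    """Hypothetical rule: place each threshold at a fraction of the
    sequence's own min-max range, so the cut-offs adapt to each video."""
    if not cas:
        return []
    lo, hi = min(cas), max(cas)
    return [lo + f * (hi - lo) for f in fractions]


def cas_to_proposals(cas: List[float], thresholds: List[float]) -> List[Proposal]:
    """Threshold a per-snippet activation sequence at several levels.

    Every contiguous run of snippets scoring above a threshold becomes one
    proposal. Using several thresholds enlarges the proposal pool; identical
    segments found at different thresholds are kept only once.
    """
    proposals = set()
    for thr in thresholds:
        start = None
        # A sentinel below any threshold closes a run that reaches the end.
        for t, score in enumerate(cas + [float("-inf")]):
            if score > thr and start is None:
                start = t
            elif score <= thr and start is not None:
                segment = cas[start:t]
                proposals.add((start, t, sum(segment) / len(segment)))
                start = None
    return sorted(proposals)


if __name__ == "__main__":
    # Toy activation scores for one action class over six snippets.
    scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.1]
    thresholds = adaptive_thresholds(scores, fractions=[0.25, 0.5, 0.75])
    for start, end, mean_score in cas_to_proposals(scores, thresholds):
        print(f"snippets [{start}, {end}) mean score {mean_score:.2f}")
```

In a full pipeline the resulting segments would typically be scored per class and filtered with non-maximum suppression; those steps are omitted here.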
