Spatio-temporal Shape and Flow Correlation for Action Recognition

Abstract:


This paper explores the use of volumetric features for action recognition. First, we propose a novel method to correlate spatio-temporal shapes to video clips that have been automatically segmented. Our method works on over-segmented videos, which means that we do not require background subtraction for reliable object segmentation. Next, we discuss and demonstrate the complementary nature of shape- and flow-based features for action recognition. Our method, when combined with a recent flow-based correlation technique, can detect a wide range of actions in video, as demonstrated by results on a long tennis video. Although not specifically designed for whole-video classification, we also show that our method's performance is competitive with current action classification techniques on a standard video classification dataset.

Published in: 2007 IEEE Conference on Computer Vision and Pattern Recognition

Date of Conference: 17-22 June 2007

Date Added to IEEE Xplore: 16 July 2007

ISBN Information:

Print ISSN: 1063-6919

DOI: 10.1109/CVPR.2007.383512

Conference Location: Minneapolis, MN, USA

1. Introduction

The goal of action recognition is to localize a particular event of interest in video, such as a tennis serve, both in space and in time. Just as object recognition is a key problem in image understanding, action recognition is a fundamental challenge for interpreting video. A recent trend in action recognition has been the emergence of techniques based on the volumetric analysis of video, where a sequence of images is treated as a three-dimensional space-time volume. Eschewing the building of explicit models of the actor or environment (e.g., kinematic models of humans), these approaches attempt to perform recognition directly on the raw video. An obvious benefit is that recognition need not be limited to a specific set of actors or actions but can, in principle, extend to a variety of events - given appropriate training data. The drawback is that volumetric representations do not easily generalize across appearance changes due to different actors, varying environmental conditions and camera viewpoint. This observation has motivated the use of video features that are robust to appearance; these can be broadly categorized as shape-based (e.g., background-subtracted human silhouettes) and flow-based (e.g., motion fields generated using optical flow). However, as discussed below, both of these types of methods have significant limitations.

Figure 1. Our goal is to detect specific actions in realistic videos with cluttered environments. First, we segment input video into space-time volumes. Then, we correlate action templates with the volumes using shape and flow features. We are able to localize events in space-time without the need for background-subtracted videos.
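To make the idea of a space-time volume concrete, the following is a minimal sketch, not the shape or flow correlation proposed in the paper. It assumes grayscale frames given as nested lists, stacks them into a T x H x W volume, and slides a small space-time template over it using plain zero-mean normalized cross-correlation on intensities. The function names (`make_volume`, `best_match`) and the choice of intensity correlation are illustrative assumptions only.

```python
from math import sqrt


def make_volume(frames):
    """Stack equally sized 2-D frames (lists of rows) into a T x H x W volume."""
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same size")
    return [[list(row) for row in frame] for frame in frames]


def _window(volume, t0, y0, x0, dt, dy, dx):
    """Flatten the dt x dy x dx sub-volume whose corner is (t0, y0, x0)."""
    return [volume[t][y][x]
            for t in range(t0, t0 + dt)
            for y in range(y0, y0 + dy)
            for x in range(x0, x0 + dx)]


def _ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat window or flat template: treat as no match
    return sum(p * q for p, q in zip(da, db)) / denom


def best_match(volume, template):
    """Slide the template over the volume; return (score, (t, y, x)) of the best fit."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    dt, dy, dx = len(template), len(template[0]), len(template[0][0])
    flat_template = _window(template, 0, 0, 0, dt, dy, dx)
    best = (-2.0, None)  # any real score lies in [-1, 1]
    for t in range(T - dt + 1):
        for y in range(H - dy + 1):
            for x in range(W - dx + 1):
                score = _ncc(_window(volume, t, y, x, dt, dy, dx), flat_template)
                if score > best[0]:
                    best = (score, (t, y, x))
    return best
```

The returned offset localizes the template in both space (y, x) and time (t), which is the sense in which an event is detected "in space-time". Raw intensity correlation of this kind is sensitive to the appearance changes the introduction describes, which is the motivation given there for shape- and flow-based features.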
