Recognizing Human Actions by Learning and Matching Shape-Motion Prototype Trees | IEEE Journals & Magazine | IEEE Xplore

Abstract:

A shape-motion prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, an action prototype tree is learned in a joint shape and motion space via hierarchical K-means clustering, and each training sequence is represented as a labeled prototype sequence; a look-up table of prototype-to-prototype distances is then generated. During testing, based on a joint probability model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint probability, which is performed efficiently by searching the learned prototype tree; actions are then recognized using dynamic prototype sequence matching. Distance measures used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in challenging situations (such as moving cameras and dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 92.86 percent on a large gesture data set (with dynamic backgrounds), 100 percent on the Weizmann action data set, 95.77 percent on the KTH action data set, 88 percent on the UCF sports data set, and 87.27 percent on the CMU action data set.
Page(s): 533 - 547
Date of Publication: 23 January 2012

PubMed ID: 21788666
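
The look-up-table matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prototypes are assumed to be given as short feature vectors, Euclidean distance stands in for the paper's joint shape-motion distance, plain dynamic time warping stands in for the dynamic prototype sequence matching, and the function names and toy data are hypothetical. The point it shows is that once frames are mapped to prototype indices, every local cost in sequence matching becomes a table look-up rather than a frame-to-frame distance computation.

```python
import math


def build_distance_table(prototypes):
    """Precompute the distance between every pair of prototypes.

    Euclidean distance is an assumption made for this sketch.
    """
    k = len(prototypes)
    table = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = math.dist(prototypes[i], prototypes[j])
            table[i][j] = table[j][i] = d
    return table


def dtw_distance(seq_a, seq_b, table):
    """Dynamic time warping cost between two prototype-index sequences.

    Each local cost is read from the precomputed table.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = table[seq_a[i - 1]][seq_b[j - 1]]
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def classify(query, training, table):
    """Return the label of the training sequence closest to the query."""
    best = min(training, key=lambda item: dtw_distance(query, item[1], table))
    return best[0]


# Toy data (hypothetical): three 2D prototypes and two labeled sequences.
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
table = build_distance_table(prototypes)
training = [("wave", [0, 1, 1, 0]), ("walk", [2, 2, 0, 2])]
query = [0, 0, 1, 0]
label = classify(query, training, table)
```

In this toy example the query is a time-warped copy of the "wave" sequence, so its warping cost to that sequence is zero and it is assigned that label. The table holds one entry per prototype pair, so its size depends on the number of prototypes rather than on the number of training frames.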

1 Introduction

Action recognition is receiving increasing attention in computer vision because of its potential applications, such as video surveillance, human-computer interaction, virtual reality, and multimedia retrieval. Descriptor matching and classification-based schemes are common approaches to action recognition. However, for large-scale action retrieval and recognition, where the training database consists of thousands of action videos, such matching schemes may require a tremendous amount of computation. Recognizing actions viewed against a dynamically varying background is another important challenge. Many studies have addressed effective feature extraction and categorization methods for robust action recognition; detailed surveys are given in [1], [2], [3].

References

[1] T.B. Moeslund, A. Hilton and V. Kruger, "A Survey of Advances in Vision-Based Human Motion Capture and Analysis", Computer Vision and Image Understanding, vol. 104, no. 2, pp. 90-126, 2006.
[2] P. Turaga, R. Chellappa, V.S. Subrahmanian and O. Udrea, "Machine Recognition of Human Activities: A Survey", IEEE Trans. Circuits and Systems for Video Technology, vol. 11, no. 8, pp. 1473-1488, Nov. 2008.
[3] R. Poppe, "A Survey on Vision-Based Human Action Recognition", Image and Vision Computing, vol. 28, no. 6, pp. 976-990, 2010.
[4] H. Li and M. Greenspan, "Multi-Scale Gesture Recognition from Time-Varying Contours", Proc. IEEE Int'l Conf. Computer Vision, vol. 1, pp. 236-243, 2005.
[5] Y. Shen and H. Foroosh, "View-Invariant Action Recognition Using Fundamental Ratios", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-6, 2008.
[6] P. Natarajan, V. Singh and R. Nevatia, "Learning 3D Action Models from a Few 2D Videos for View Invariant Action Recognition", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2010.
[7] A. Efros, A. Berg, G. Mori and J. Malik, "Recognizing Action at a Distance", Proc. IEEE Int'l Conf. Computer Vision, vol. 2, pp. 726-733, 2003.
[8] G.R. Bradski and J.W. Davis, "Motion Segmentation and Pose Recognition with Motion History Gradients", Machine Vision and Applications, vol. 13, pp. 174-184, 2002.
[9] A. Fathi and G. Mori, "Action Recognition by Learning Mid-Level Motion Features", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2008.
[10] Y. Wang, P. Sabzmeydani and G. Mori, "Semi-Latent Dirichlet Allocation: A Hierarchical Model for Human Action Recognition", Proc. IEEE Int'l Conf. Computer Vision Workshop Human Motion Understanding Modeling Capture and Animation, pp. 240-254, 2007.
[11] A. Elgammal, V. Shet, Y. Yacoob and L.S. Davis, "Learning Dynamics for Exemplar-Based Gesture Recognition", Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 571-578, 2003.
[12] C. Thurau and V. Hlavac, "Pose Primitive Based Human Action Recognition in Videos or Still Images", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2008.
[13] C. Schuldt, I. Laptev and B. Caputo, "Recognizing Human Actions: A Local SVM Approach", Proc. Int'l Conf. Pattern Recognition, vol. 3, pp. 32-36, 2004.
[14] I. Laptev and P. Perez, "Retrieving Actions in Movies", Proc. IEEE Int'l Conf. Computer Vision, pp. 1-8, 2007.
[15] I. Laptev, M. Marszalek, C. Schmid and B. Rozenfeld, "Learning Realistic Human Actions from Movies", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2008.
[16] E. Shechtman and M. Irani, "Space-Time Behavior-Based Correlation", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 11, pp. 2045-2056, Nov. 2007.
[17] M. Blank, L. Gorelick, E. Shechtman, M. Irani and R. Basri, "Actions as Space-Time Shapes", Proc. IEEE Int'l Conf. Computer Vision, vol. 2, pp. 1395-1402, 2005.
[18] J.C. Niebles, H. Wang and L. Fei-Fei, "Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words", Int'l J. Computer Vision, vol. 79, no. 3, pp. 299-318, 2007.
[19] Y. Ke, R. Sukthankar and M. Hebert, "Spatio-Temporal Shape and Flow Correlation for Action Recognition", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2007.
[20] P. Dollar, V. Rabaud, G. Cottrell and S. Belongie, "Behavior Recognition via Sparse Spatio-Temporal Features", Proc. Int'l Workshop Visual Surveillance and Performance Evaluation of Tracking and Surveillance, pp. 65-72, 2005.
[21] J. Liu and M. Shah, "Learning Human Actions via Information Maximization", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2008.
[22] Y. Ke, R. Sukthankar and M. Hebert, "Event Detection in Crowded Videos", Proc. IEEE Int'l Conf. Computer Vision, pp. 1-8, 2007.
[23] S. Nowozin, G. Bakir and K. Tsuda, "Discriminative Subsequence Mining for Action Classification", Proc. IEEE Int'l Conf. Computer Vision, pp. 1-8, 2007.
[24] M.D. Rodriguez, J. Ahmed and M. Shah, "Action MACH: A Spatio-Temporal Maximum Average Correlation Height Filter for Action Recognition", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2008.
[25] S. Wong and R. Cipolla, "Extracting Spatiotemporal Interest Points Using Global Information", Proc. IEEE Int'l Conf. Computer Vision, pp. 1-8, 2007.
[26] J. Liu, J. Luo and M. Shah, "Recognizing Realistic Actions from Videos in the Wild", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2009.
[27] J. Yuan, Z. Liu and Y. Wu, "Discriminative Subvolume Search for Efficient Action Detection", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2009.
[28] A. Kovashka and K. Grauman, "Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2010.
[29] H. Jhuang, T. Serre, L. Wolf and T. Poggio, "A Biologically Inspired System for Action Recognition", Proc. IEEE Int'l Conf. Computer Vision, pp. 1-8, 2007.
[30] J. Liu, S. Ali and M. Shah, "Recognizing Human Actions Using Multiple Features", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2008.