
Learning human actions via information maximization



Abstract:

In this paper, we present a novel approach for automatically learning a compact yet discriminative appearance-based human action model. A video sequence is represented by a bag of spatiotemporal features called video-words, obtained by quantizing the 3D interest points (cuboids) extracted from the videos. Our proposed approach automatically discovers the optimal number of video-word clusters by utilizing Maximization of Mutual Information (MMI). Unlike the k-means algorithm, which is typically used to cluster spatiotemporal cuboids into video-words based on their appearance similarity, MMI clustering further groups the video-words that are highly correlated with particular groups of actions. To capture the structural information of the learnt optimal video-word clusters, we explore the correlation of the compact video-word clusters using a modified correlogram, which is not only translation and rotation invariant, but also somewhat scale invariant. We extensively test our proposed approach on two publicly available challenging datasets: the KTH dataset and the IXMAS multiview dataset. To the best of our knowledge, we are the first to apply a bag-of-video-words approach to the multiview dataset. We have obtained very impressive results on both datasets.
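The pipeline the abstract outlines — quantizing cuboid descriptors into video-words, representing each sequence as a word histogram, and scoring word/action groupings by mutual information — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are ours, plain nearest-center assignment stands in for the k-means codebook step, and the actual MMI merging procedure is omitted.

```python
import math
from collections import Counter

def quantize_features(features, codebook):
    """Assign each spatiotemporal descriptor (cuboid) to its nearest
    codebook center; the center's index is the 'video-word'."""
    def nearest(f):
        return min(range(len(codebook)),
                   key=lambda j: sum((a - b) ** 2
                                     for a, b in zip(f, codebook[j])))
    return [nearest(f) for f in features]

def bag_of_video_words(word_ids, vocab_size):
    """Normalized histogram of video-word occurrences for one sequence."""
    counts = Counter(word_ids)
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in range(vocab_size)]

def mutual_information(joint):
    """I(W; A) in bits from a word-by-action co-occurrence count table.
    MMI clustering merges video-words into groups chosen to lose as
    little of this quantity as possible."""
    total = sum(sum(row) for row in joint)
    pw = [sum(row) / total for row in joint]                 # word marginal
    pa = [sum(joint[i][j] for i in range(len(joint))) / total
          for j in range(len(joint[0]))]                     # action marginal
    mi = 0.0
    for i, row in enumerate(joint):
        for j, c in enumerate(row):
            p = c / total
            if p > 0:
                mi += p * math.log2(p / (pw[i] * pa[j]))
    return mi
```

Under this view, merging two video-words is attractive when collapsing their two rows of the word-action count table into one reduces the mutual information only slightly — which is how a compact vocabulary can stay discriminative for actions.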
Date of Conference: 23-28 June 2008
Date Added to IEEE Xplore: 05 August 2008
Print ISSN: 1063-6919
Conference Location: Anchorage, AK, USA

1. Introduction

Automatically recognizing human actions is critical for applications such as video indexing and video summarization. However, it remains a challenging problem due to camera motion, occlusion, illumination changes, and individual variations in object appearance and posture.

References

1. A. Bobick and J. Davis. The recognition of human movement using temporal templates. PAMI, 23(3):257-267, 2001.
2. P. Dollar, V. Rabaud, G. Cottrell and S. Belongie. Behavior recognition via sparse spatio-temporal features. ICCV workshop: VS-PETS, 2005.
3. A. Efros, A. Berg, G. Mori and J. Malik. Recognizing action at a distance. ICCV 2003.
4. C. Fanti, L. Zelnik-Manor and P. Perona. Hybrid models for human recognition. ICCV 2005.
5. R. Fergus, L. Fei-Fei, P. Perona and A. Zisserman. Learning Object Categories from Google's Image Search. ICCV 2005.
6. Y. Ke, R. Sukthankar and M. Hebert. Efficient visual event detection using volumetric features. ICCV 2005.
7. J. Liu and M. Shah. Scene Modeling Using Co-Clustering. ICCV 2007.
8. I. Laptev. On space-time interest points. IJCV, 64(2-3):107-123, 2005.
9. S. Lazebnik, C. Schmid and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR 2006.
10. S. Nowozin, G. Bakir and K. Tsuda. Discriminative subsequence mining for action classification. ICCV 2007.
11. J. Niebles and L. Fei-Fei. A hierarchical model of shapes and appearance for human action classification. CVPR 2007.
12. J. Niebles and L. Fei-Fei. Unsupervised learning of human action categories using spatial-temporal words. BMVC 2006.
13. V. Parameswaran and R. Chellappa. View Invariance for Human Action Recognition. IJCV, 66(1), 2006.
14. E. Shechtman and M. Irani. Space-time behavior based correlation. CVPR 2005.
15. Y. Song, L. Goncalves and P. Perona. Unsupervised learning of human motion. PAMI, 25(25):1-14, 2003.
16. C. Schuldt, I. Laptev and B. Caputo. Recognizing human actions: a local SVM approach. ICPR 2004.
17. S. Savarese, J. Winn and A. Criminisi. Discriminative Object Class Models of Appearance and Shape by Correlatons. CVPR 2006.
18. N. Slonim and N. Tishby. Document Clustering Using Word Clusters via the Information Bottleneck Method. ACM SIGIR 2000.
19. N. Tishby, F. Pereira and W. Bialek. The Information Bottleneck Method. Proc. of the 37th Annual Allerton Conference on Communication, Control and Computing, 1999.
20. D. Weinland, E. Boyer and R. Ronfard. Action recognition from arbitrary views using 3D exemplars. ICCV 2007.
21. J. Winn, A. Criminisi and T. Minka. Object Categorization by Learned Universal Visual Dictionary. ICCV 2005.
22. S. Wong, T. Kim and R. Cipolla. Learning Motion Categories Using Both Semantic and Structural Information. CVPR 2007.
23. A. Yilmaz and M. Shah. Action sketch: A Novel Action Representation. CVPR 2005.
24. F. Lv and R. Nevatia. Single View Human Action Recognition Using Key Pose Matching and Viterbi Path Searching. CVPR 2007.