1. Introduction
Accurate human action classification is a fundamental problem in computer vision and has been an active field of research in recent years. However, robust action recognition remains a challenging task for computers due to cluttered backgrounds, camera motion, occlusion, and geometric and photometric variations of the foreground persons. A good example is shown in Fig. 1. In this dataset, many different subjects perform the same action (e.g. walking or hand waving) against different backgrounds (e.g. indoor, outdoor), recorded by a moving camera (e.g. zooming in and out).