1. Introduction
Human detection in video footage is an important task in many applications, including video surveillance in dynamic scenes, driving assistance systems, and content-based retrieval. Effective algorithms have been developed for human detection in video captured by conventional cameras [1]. For video surveillance, fish-eye cameras are often used because a single camera can cover a large region. However, because frames captured by fish-eye cameras are strongly distorted, and the distortion varies across the image, human detection in fish-eye video remains an open challenge. Some previous works [2] [3] require the intrinsic or extrinsic parameters of the fish-eye camera to be known in advance and use these parameters to support the detection process. Satio et al. [4] use geometric relations to estimate the approximate height of a human at a given location and use this estimate to guide human detection. Their algorithm is further limited to applications in which people walk only through certain regions of the surveyed area. Other approaches first warp the fish-eye view to a conventional perspective view and then apply human detection algorithms designed for such views. These approaches suffer from inaccuracies in camera calibration and from the distortion introduced by the warping process.