I. Introduction
The ability to automatically detect crowds in a given environment is fundamental to various applications. Crowd density estimation is widely used in human safety monitoring, traffic control, smart guiding in museums, and other significant applications. To obtain a precise distribution, most classical solutions analyze images and video. Through the steps of background modeling, change detection, grouping, and event interpretation, they obtain the distribution of crowd density. However, these approaches handle occlusion and crowded scenes poorly, and their high computational cost also undermines their practicality.