I. Introduction
Stampedes, which occur frequently at large events around the world, have caused serious disasters. For example, many people were killed or injured in the fatal Shanghai Bund stampede during the 2015 New Year celebrations. If the crowd density at the scene had been accurately estimated and corresponding security measures arranged in advance, such incidents might have been mitigated or avoided. Accurate knowledge of the crowd size and crowd distribution in a public space is therefore essential. With the ubiquitous installation of surveillance cameras in cities and urban areas, crowd scene analysis from images or videos has become an important practical and research topic in the computer vision community.