Crowd Counting via Multi-view Scale Aggregation Networks | IEEE Conference Publication | IEEE Xplore

Crowd Counting via Multi-view Scale Aggregation Networks


Abstract:

Crowd counting, which aims to estimate the total number of people in unconstrained crowded scenes, has received increasing attention. However, it is greatly challenged by the huge variation in people's scale. In this paper, we propose a novel Multi-View Scale Aggregation Network (MVSAN), which handles scale variation comprehensively from the feature, input, and criterion views. Firstly, we design a simple but effective Multi-Scale Feature Encoder, which exploits dilated convolution layers with various dilation rates to improve the representation ability and scale diversity of features. Secondly, we feed input images at multiple scales into the networks to generate high-quality density maps in a coarse-to-fine manner. Finally, we propose a Multi-Scale Structural Similarity loss to force our networks to learn the local correlation of density maps. Extensive experiments on two standard benchmarks show that the proposed method generates high-quality crowd density maps and accurate count estimates, outperforming state-of-the-art methods by a large margin.
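
As a rough illustration of the feature view, the sketch below shows how parallel dilated convolutions with different dilation rates cover different spatial extents of the same input. It is not the authors' Multi-Scale Feature Encoder: the abstract does not give the layer layout, so the 1D setting, the dilation rates (1, 2, 3), the fixed kernel, and the function names are all assumptions made for illustration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated cross-correlation (odd kernel size)."""
    k = len(kernel)
    half = (k // 2) * dilation  # padding on each side so output length matches input
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(k):
            # Kernel taps are spaced `dilation` samples apart.
            acc += kernel[j] * padded[i + j * dilation]
        out.append(acc)
    return out


def multi_scale_features(signal, kernel, dilations=(1, 2, 3)):
    """One branch per (assumed) dilation rate, stacked per position."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    for d in (1, 2, 3):
        print("dilation", d, "covers", effective_kernel_size(3, d), "samples")
    print(multi_scale_features([1, 2, 3, 4, 5], [1, 1, 1]))
```

Each position ends up with one response per dilation rate, so a single 3-tap kernel sees neighbourhoods of 3, 5, and 7 samples without any extra parameters. In a real 2D network the branches would be learned convolution layers and their outputs would typically be concatenated along the channel axis.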
Date of Conference: 08-12 July 2019
Date Added to IEEE Xplore: 05 August 2019
Conference Location: Shanghai, China

1. Introduction

With the rapid growth of the urban population, public safety has become a major challenge in city management. Most safety control measures rely on crowd counting, which estimates the number of people in images and surveillance videos. However, the large variation in people's scale across the massive volume of street images from social networks and real-time surveillance remains one of the main obstacles to accurate estimation.

