Scale-Context Perceptive Network for Crowd Counting and Localization in Smart City System


Abstract:

Crowd counting and localization aim to predict the number and positions of people in a crowd, a practical and essential subtask in crowd analysis and smart city systems. However, the inherent problems of scale variation and background disturbance limit the performance of existing methods. While recent research focuses on studying counting and localization independently, few works are capable of executing both tasks simultaneously. To this end, we propose a scale-context perceptive network (SCPNet) to jointly tackle the crowd counting and localization tasks in a unified framework. Specifically, a scale perceptive (SP) module with a local–global branch schema is designed to capture multiscale information. Meanwhile, a context perceptive (CP) module based on a channel-spatial self-attention mechanism is introduced to suppress background disturbance. Furthermore, a novel hierarchical scale loss function that combines the Euclidean loss and the structural similarity loss is designed to drive the proposed model to perform counting and localization simultaneously. Extensive experiments on challenging crowd data sets demonstrate the superiority of the proposed SCPNet over state-of-the-art competitors in both objective and subjective evaluations.
Published in: IEEE Internet of Things Journal ( Volume: 10, Issue: 21, 01 November 2023)
Page(s): 18930 - 18940
Date of Publication: 19 April 2023


I. Introduction

Crowd analysis is an emerging topic in computer vision and a crucial task in smart city applications, e.g., video monitoring, urban planning, and public security [1], [2]. Its two essential subtasks, counting and localization, have drawn significant attention in recent years; their objectives are to infer the number and the locations of pedestrians, respectively. Approaches to crowd counting are constantly being improved and are increasingly effective. Meanwhile, crowd localization, which evolved from crowd counting, is gradually being explored and developed. Both can serve high-level vision tasks, e.g., crowd tracking [3] and 3-D human pose estimation [4].

References

1. G. Gao, J. Gao, Q. Liu, Q. Wang and Y. Wang, "CNN-based density estimation and crowd counting: A survey", arXiv:2003.12783, 2020.
2. Z. Fan, H. Zhang, Z. Zhang, G. Lu, Y. Zhang and Y. Wang, "A survey of crowd counting and density estimation based on convolutional neural network", Neurocomputing, vol. 472, pp. 224-251, Feb. 2022, [online] Available: https://doi.org/10.1016/j.neucom.2021.02.103.
3. R. Sundararaman, C. Braga, É. Marchand and J. Pettré, "Tracking pedestrian heads in dense crowd", Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 3864-3874, 2021, [online] Available: https://doi.org/10.1109/CVPR46437.2021.00386.
4. C. Zheng, S. Zhu, M. Mendieta, T. Yang, C. Chen and Z. Ding, "3D human pose estimation with spatial and temporal transformers", Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), pp. 11636-11645, 2021, [online] Available: https://doi.org/10.1109/ICCV48922.2021.01145.
5. A. de Santana Correia and E. Colombini, "Attention please! A survey of neural attention models in deep learning", arXiv:2103.16775, 2021.
6. Y. Hu et al., "NAS-count: Counting-by-density with neural architecture search", Proc. Eur. Conf. Comput. Vis. (ECCV), pp. 747-766, 2020, [online] Available: https://doi.org/10.1007/978-3-030-58542-6_45.
7. Y. Li, X. Zhang and D. Chen, "CSRNet: Dilated convolutional neural networks for understanding the highly congested scenes", Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 1091-1100, 2018, [online] Available: https://doi.org/10.1109/CVPR.2018.00120.
8. V. A. Sindagi and V. M. Patel, "HA-CCN: Hierarchical attention-based crowd counting network", IEEE Trans. Image Process., vol. 29, pp. 323-335, 2020, [online] Available: https://doi.org/10.1109/TIP.2019.2928634.
9. D. Liang, W. Xu, Y. Zhu and Y. Zhou, "Focal inverse distance transform maps for crowd localization", IEEE Trans. Multimedia, Sep. 2022.
10. J. Cheng, H. Xiong, Z. Cao and H. Lu, "Decoupled two-stage crowd counting and beyond", IEEE Trans. Image Process., vol. 30, pp. 2862-2875, 2021, [online] Available: https://doi.org/10.1109/TIP.2021.3055631.
11. W. Zhai et al., "DA2Net: A dual attention-aware network for robust crowd counting", Multimedia Syst., [online] Available: https://doi.org/10.1007/s00530-021-00877-4.
12. Y. Zhang, D. Zhou, S. Chen, S. Gao and Y. Ma, "Single-image crowd counting via multi-column convolutional neural network", Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 589-597, 2016, [online] Available: https://doi.org/10.1109/CVPR.2016.70.
13. J. Gao, Q. Wang and Y. Yuan, "SCAR: Spatial-/channel-wise attention regression networks for crowd counting", Neurocomputing, vol. 363, pp. 1-8, Oct. 2019, [online] Available: https://doi.org/10.1016/j.neucom.2019.08.018.
14. N. Liu, Y. Long, C. Zou, Q. Niu, L. Pan and H. Wu, "ADCrowdNet: An attention-injective deformable convolutional network for crowd understanding", Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 3220-3229, 2019, [online] Available: https://doi.org/10.1109/CVPR.2019.00334.
15. W. Zhai et al., "An attentive hierarchy ConvNet for crowd counting in smart city", Clust. Comput., vol. 26, pp. 1099-1111, Sep. 2022, [online] Available: https://doi.org/10.1007/s10586-022-03749-2.
16. D. B. Sam, S. Surya and R. V. Babu, "Switching convolutional neural network for crowd counting", Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 4031-4039, 2017, [online] Available: https://doi.org/10.1109/CVPR.2017.429.
17. X. Cao, Z. Wang, Y. Zhao and F. Su, "Scale aggregation network for accurate and efficient crowd counting", Proc. Eur. Conf. Comput. Vis. (ECCV), pp. 734-750, 2018, [online] Available: https://doi.org/10.1007/978-3-030-01228-1_45.
18. M.-H. Guo et al., "Attention mechanisms in computer vision: A survey", arXiv:2111.07624, 2022.
19. X. Guo, M. Gao, W. Zhai, Q. Li, K. H. Kim and G. Jeon, "Dense attention fusion network for object counting in IoT system", Mobile Netw. Appl., [online] Available: https://doi.org/10.1007/s11036-023-02090-1.
20. X. Guo, M. Anisetti, M. Gao and G. Jeon, "Object counting in remote sensing via triple attention and scale-aware network", Remote Sens., vol. 14, no. 24, pp. 6363, 2022, [online] Available: https://www.mdpi.com/2072-4292/14/24/6363.
21. W. Zhai, M. Gao, M. Anisetti, Q. Li, S. Jeon and J. Pan, "Group-split attention network for crowd counting", J. Electron. Imag., vol. 31, no. 4, 2022, [online] Available: https://doi.org/10.1117/1.JEI.31.4.041214.
22. Y. Miao, Z. Lin, G. Ding and J. Han, "Shallow feature based dense attention network for crowd counting", Proc. AAAI Conf. Artif. Intell. (AAAI), vol. 34, pp. 11765-11772, 2020, [online] Available: https://doi.org/10.1609/AAAI.V34I07.6848.
23. X. Guo, M. Gao, W. Zhai, J. Shang and Q. Li, "Spatial-frequency attention network for crowd counting", Big Data, vol. 10, no. 5, pp. 453-465, 2022, [online] Available: https://doi.org/10.1089/big.2022.0039.
24. H. Lin, Z. Ma, R. Ji, Y. Wang and X. Hong, "Boosting crowd counting via multifaceted attention", Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 19628-19637, 2022, [online] Available: https://doi.org/10.1109/CVPR52688.2022.01901.
25. H. Idrees et al., "Composition loss for counting density map estimation and localization in dense crowds", Proc. Eur. Conf. Comput. Vis. (ECCV), pp. 532-546, 2018, [online] Available: https://doi.org/10.1007/978-3-030-01216-8_33.
26. C. Liu, X. Weng and Y. Mu, "Recurrent attentive zooming for joint crowd counting and precise localization", Proc. CVPR, pp. 1217-1226, 2019, [online] Available: https://doi.org/10.1109/CVPR.2019.00131.
27. D. B. Sam, S. V. Peri, M. N. Sundararaman, A. Kamath and R. V. Babu, "Locate size and count: Accurately resolving people in dense crowds via detection", IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 8, pp. 2739-2751, Aug. 2021, [online] Available: https://doi.org/10.1109/tpami.2020.2974830.
28. S. Abousamra, M. Hoai, D. Samaras and C. Chen, "Localization in the crowd with topological constraints", Proc. AAAI Conf. Artif. Intell. (AAAI), pp. 872-881, 2021, [online] Available: https://doi.org/10.1609/aaai.v35i2.16170.
29. D. Liang, W. Xu and X. Bai, "An end-to-end transformer model for crowd localization", Proc. Eur. Conf. Comput. Vis. (ECCV), pp. 1-17, 2022.
30. J. Wang et al., "Deep high-resolution representation learning for visual recognition", IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 10, pp. 3349-3364, Oct. 2021, [online] Available: https://doi.org/10.1109/TPAMI.2020.2983686.
