The pipeline of our proposed DAM counting scheme. Blue arrows indicate loss functions between the output and the regression target.
Abstract:
Previous counting methods trained with the density map regression scheme fail to count birds precisely in crowded bird images of various scales. This is due to the coarseness of the manually created target density maps. In this paper, we propose a new counting scheme, called DAM counting, which generates a density activation map (DAM), a concept we introduce here. A DAM is a density map from the CNN's perspective: it has high activation values in the regions the network focuses on to count birds precisely. The network is trained to autonomously learn where and how strongly the DAM should be activated so that the sum of all values in the DAM estimates the number of birds. Moreover, our DAM counting scheme incorporates two segmentation regularizers that enable precise counting of birds with various scales and appearances. Our DAM counting scheme can effectively replace the existing density map regression scheme, yielding a 45% increase in counting accuracy. We also propose the first crowded bird dataset, called CBD-6000, a valuable resource for crowded bird counting research.
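
The central property stated in the abstract, that the sum of all values in the DAM estimates the number of birds, can be illustrated with a minimal sketch. The function names `count_from_map` and `count_error`, the toy 3x3 map, and the absolute-difference error are assumptions made for illustration; the abstract does not specify the loss functions the network is actually trained with.

```python
def count_from_map(dam):
    """Estimate the object count as the sum of all values in a 2D map.

    `dam` is a list of rows of floats (a hypothetical toy representation).
    """
    return sum(sum(row) for row in dam)


def count_error(dam, true_count):
    """Absolute difference between the summed map and the true count.

    This is an illustrative error measure, not the paper's training loss.
    """
    return abs(count_from_map(dam) - true_count)


# Toy 3x3 map: activations are concentrated on a few pixels,
# yet the total still adds up to the number of objects.
dam = [
    [0.0, 0.5, 0.5],
    [0.0, 1.0, 0.0],
    [0.25, 0.25, 0.5],
]

estimated = count_from_map(dam)
print(estimated)
```

Only the total is constrained in this sketch; where the activations land is left free, which is the degree of freedom the abstract says the network learns on its own.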
Published in: IEEE Access (Volume: 8)

FIGURE 1.
Sample images from our proposed dataset, CBD-6000. Various scenes of occluded birds with diverse scales are included.

FIGURE 2.
(a): An input bird image. (b): Target density map constructed by placing 2D Gaussian kernels. (c): Output density map of [15], which is trained by the density map regression scheme using (b) as the target. (d): Uniform target density map. (e): Output density map of [1] when trained by the density map regression scheme using (d) as the target. (f): Density activation map generated by the same network as (e) but trained with our DAM counting scheme.
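
The two kinds of target map named in panels (b) and (d) can be sketched as follows. This is a minimal illustration under stated assumptions: the kernel width `sigma`, the per-kernel normalization to unit mass, and the reading of "uniform" as the count spread evenly over all pixels are not given in the caption, and the function names are hypothetical.

```python
import math


def gaussian_density_map(height, width, points, sigma=2.0):
    """Build a target map by placing one 2D Gaussian kernel per annotated point.

    Each kernel is normalized to sum to 1 over the image (an assumption),
    so the whole map sums to the number of points.
    `points` is a list of (row, col) annotations.
    """
    density = [[0.0] * width for _ in range(height)]
    for py, px in points:
        kernel = [
            [
                math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2.0 * sigma ** 2))
                for x in range(width)
            ]
            for y in range(height)
        ]
        total = sum(sum(row) for row in kernel)
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / total
    return density


def uniform_density_map(height, width, count):
    """Spread the count evenly over every pixel (assumed meaning of 'uniform')."""
    value = count / (height * width)
    return [[value] * width for _ in range(height)]


gaussian_target = gaussian_density_map(16, 16, [(4, 4), (10, 12)], sigma=1.5)
uniform_target = uniform_density_map(16, 16, count=2)
```

Both targets sum to the same count. They differ only in where the mass is placed, which is the aspect of the target that the caption contrasts across panels.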

FIGURE 3.
The pipeline of our proposed DAM counting scheme. Blue arrows indicate loss functions between the output and the regression target.

FIGURE 4.
Visual comparison. (a) Input bird image. (b) Density map generated by PenNet+. The small density map in the corner is the uniform target density map that the network is trained to regress. (c) Density activation map generated by PenNet++DAM.

FIGURE 5.
Per-epoch test MAE of the network trained by the density map regression scheme (black) and the network trained by our proposed DAM counting scheme (red).
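
For reference, the MAE plotted in Figure 5 is the mean absolute error between predicted and ground-truth counts over the test images. A minimal sketch of that standard metric follows; the function name and the sample counts are made up for illustration and are not results from the paper.

```python
def mean_absolute_error(predicted_counts, true_counts):
    """Mean absolute difference between predicted and ground-truth counts."""
    if len(predicted_counts) != len(true_counts):
        raise ValueError("predicted and true counts must have the same length")
    if not predicted_counts:
        raise ValueError("at least one image is required")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(predicted_counts)


# Hypothetical counts for three test images (not data from the paper).
predicted = [10.0, 22.5, 7.0]
ground_truth = [12, 20, 7]
mae = mean_absolute_error(predicted, ground_truth)
print(mae)
```

Evaluating this once per epoch on the test set gives one point per epoch on a curve like those in the figure.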

FIGURE 6.
Visual results of the ablation study. (a) Input bird image. (b) The output DAM when using only the strong segmentation regularizer. (c) The output DAM when using only the weak segmentation regularizer. (d) The fused DAM obtained via the attention module.