1. Introduction
We consider the problem of crowd counting in arbitrary static images. Given an image of a crowded scene and no prior knowledge about that scene (e.g. camera position, scene layout, crowd density), our goal is to estimate a density map for the input image, where each pixel value in the density map gives the crowd density at the corresponding location in the input image. The crowd count is then obtained by integrating over the entire density map, which for a discrete map amounts to summing its pixel values. In particular, we focus on the setting where the training data have dotted annotations, i.e. each object instance (e.g. a person) is annotated with a single point in the image.
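To make the relationship between dotted annotations, density maps, and counts concrete, the following is a minimal sketch rather than the method used in this work. It assumes that a ground-truth density map is built by placing a unit-mass Gaussian at each annotated point, a common choice in the density-estimation literature that this section does not itself specify. The function names `density_map_from_dots` and `crowd_count` and the bandwidth `sigma` are hypothetical, chosen only for illustration.

```python
import numpy as np


def density_map_from_dots(points, height, width, sigma=4.0):
    """Build a density map from dot annotations.

    points: iterable of (row, col) positions, assumed to lie inside the image.
    sigma:  Gaussian bandwidth in pixels (illustrative value, an assumption).
    """
    rows = np.arange(height, dtype=np.float64)[:, None]
    cols = np.arange(width, dtype=np.float64)[None, :]
    density = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        # Unnormalized Gaussian centred on the annotated point.
        kernel = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Normalize over the image grid so each dot contributes exactly one unit.
        density += kernel / kernel.sum()
    return density


def crowd_count(density):
    """Integrate (sum) the density map to obtain the estimated count."""
    return float(density.sum())


# Three annotated people in a 64 x 64 image.
dots = [(10, 12), (30, 40), (5, 5)]
density = density_map_from_dots(dots, height=64, width=64)
count = crowd_count(density)
```

Because every kernel is normalized over the image grid, the sum of the map equals the number of annotated points up to floating-point error, which is the property that lets a predicted density map be turned into a count.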