I. Introduction
In recent years, China’s grape planting area and output have increased, and traditional vineyard management has begun to shift toward more efficient and intelligent practices. Automatic grape yield estimation is an important link in realizing intelligent vineyard management, and it is of great significance for reducing production costs and improving agricultural efficiency. Counting grape berries is a key step in the yield estimation process. Currently, berry counting still relies mainly on traditional manual observation, which is inefficient, highly subjective, and labor-intensive. Some efforts have been made to automate the counting of grape berries [1], [2]. However, these studies considered only a single setting in which individual berries are easily distinguishable and the number of grape bunches is small. Such methods remain inadequate for rapid, large-scale grape yield estimation in complex field environments [1]. Therefore, field grape berry counting needs to shift gradually from tedious, restrictive manual procedures to low-cost, efficient, and robust approaches. Computer vision and deep learning have developed rapidly in recent decades, and object counting methods based on computer vision are expected to become an effective means of automating grape berry counting in the field.