1. Introduction
In clinical scoliosis diagnosis, experts need to view hundreds of frames in an ultrasound sequence covering the whole spinal column, which is time-consuming and tedious [1]. To simplify the measurement, Volume Projection Imaging (VPI) was proposed to synthesize coronal 2D images based on the intensity of the voxels in the ultrasound sequence [2, 3]. However, owing to the fast movement of the probe and the noise in the collected spatial information, ultrasound VPI images usually suffer from significant degradation by structured noise, which not only affects the performance of automatic pathological analysis but also makes accurate diagnosis harder for doctors. As presented in Fig. 1, structured noise, unlike random noise, shows high spatial correlation and appears only in some regions of the image. Its presence degrades the discriminative patterns in the ultrasound images and consequently confuses deep networks when they perform classification, detection, or segmentation. VPI image restoration remains an open problem because ground-truth data is generally inaccessible. Moreover, the structured noise varies with the ultrasound operator, the probe, and the empirically chosen imaging parameters, which makes the degradation hard to model. Hence, it is also impractical to synthesize paired noisy and noiseless samples for learning.