I. Introduction
Visual saliency, which aims to identify the most conspicuous objects or areas in an image, has intrigued great interest in recent years. It has been shown effective owing to various applications in many visual tasks, such as scene classification [1], image captioning [2] and visual tracking [3], [4]. Although visual saliency has achieved state-of-the-art performance through the use of deep convolutional neural networks (CNNs), it still suffers from two major problems in denoising the final prediction against the cluttered background and preserving the boundaries of salient objects.