I. Introduction
Point clouds have recently been recognized as a significant media format because they reveal the essential geometric structure of three-dimensional (3D) scenes and have potential in a wide variety of applications. These applications can be classified into two categories according to their end use and intended recipient. The first category is human-oriented applications such as virtual reality (VR), augmented reality (AR), and cultural heritage, which aim to provide immersive, photo-realistic, and interactive viewing experiences for human viewers. One recent representative work in this category uses point clouds to represent view-dependent light fields with unlimited six degrees of freedom [1], [2]. The second category is machine-oriented applications such as autonomous navigation for vehicles or drones, where the ultimate recipients are machines rather than human eyes. Machine-oriented applications focus on object detection, object segmentation, localization, and other computer vision tasks, for which deep-learning-based methods that process 3D point clouds with neural networks have recently made great progress [3]–[6].