I. Introduction
Recent advances in machine learning have opened new opportunities for building artificial intelligence on graphs, which are versatile data structures for representing non-sequential data of a discrete nature. As illustrated in Figure 1, graph-based discrete data differs from vector-based discretizable data in that the former consists of indivisible elements that must be inserted or withdrawn atomically, whereas the latter consists of discretized samples drawn from a continuous signal at a tunable resolution. Consequently, graph data does not trivially permit interpolation, convolution, or inner products, which are operations commonly used in feature extraction. Special care must therefore be taken to generalize machine learning algorithms that operate on fixed-length feature vectors and uniform grids to their graph-based counterparts.