1 Introduction
Graph embedding aims to learn a low-dimensional representation for each node in a given graph, on which subsequent inferences can be directly performed. Recently, graph neural networks have emerged an attractive graph embedding solution due to the representation power: they can learn nonlinear mappings from the graph space to the embedding space, while traditional methods such as matrix factorization are restricted to linear mappings [1]. On the other hand, powerful graph neural networks can fail if their architectures do not align with the downstream tasks [2].