I. Introduction
Click-through rate (CTR) prediction plays an essential role in the advertising industry and has attracted considerable research attention over the past decade. Data in CTR prediction typically consist of multi-categorical features [1]–[4]. The usual way to handle multi-categorical data is to transform it into binary features via one-hot or multi-hot encoding. Recently, deep-learning-based methods have flourished [1]–[3], [5], [6]; they use an embedding layer to map large-scale, sparse input features into low-dimensional, dense embedding vectors. These methods all follow a structure known as Embedding&MLP.
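
To make the encoding and embedding steps concrete, the following is a minimal sketch in plain Python rather than the implementation of any cited method. The field names, vocabularies, embedding dimension, randomly initialized tables, and the use of sum pooling for multi-hot fields are all assumptions chosen for illustration. In a trained model the tables would be learned jointly with the MLP.

```python
import random

def multi_hot(values, vocab):
    """Encode a list of category values as a binary vector over vocab.

    With a single value this reduces to one-hot encoding.
    """
    index = {v: i for i, v in enumerate(vocab)}
    vec = [0] * len(vocab)
    for v in values:
        vec[index[v]] = 1
    return vec

def embed(binary_vec, table):
    """Sum the embedding rows selected by the nonzero entries.

    Sum pooling is an assumption here; other pooling choices are possible.
    """
    dim = len(table[0])
    out = [0.0] * dim
    for i, bit in enumerate(binary_vec):
        if bit:
            for d in range(dim):
                out[d] += table[i][d]
    return out

# Hypothetical fields and vocabularies (not taken from the paper).
GENDER = ["female", "male"]
CATEGORIES = ["books", "shoes", "phones", "toys"]
EMBED_DIM = 3

# Randomly initialized embedding tables, one row per vocabulary entry.
rng = random.Random(0)
gender_table = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in GENDER]
category_table = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in CATEGORIES]

# Sparse binary features: one-hot for a single-valued field,
# multi-hot for a field that can hold several values at once.
gender_vec = multi_hot(["female"], GENDER)
visited_vec = multi_hot(["shoes", "toys"], CATEGORIES)

# Dense input: concatenate the per-field embeddings; an MLP would consume this.
dense_input = embed(gender_vec, gender_table) + embed(visited_vec, category_table)
```

Here the six-dimensional sparse input (two gender entries plus four category entries) is mapped to a dense vector of length 2 × EMBED_DIM, which is the part of the Embedding&MLP structure that precedes the fully connected layers.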