I. Introduction
Integrated artificial intelligence (AI) and communication has been recognized as one of the six usage scenarios of 6G in the IMT-2030 framework [1]. A fundamental function of this usage scenario is to provide ubiquitous intelligent services at the network edge [2], [3], [4], [5]. To implement these services, it is desirable to deploy well-trained AI models at the network edge, which has given rise to a research area known as edge AI inference, or edge inference [6], [7], [8].