I. Introduction
Recent years have witnessed thriving research activity on model predictive control (MPC) owing to its broad applications in intelligent vehicles [1], [2], microgrid operation [3], industrial processes [4], and other fields. MPC, also called receding horizon control, is a form of model-based optimal control. At each instant, the control input of MPC is obtained by minimizing a given cost function over a finite prediction horizon, which amounts to solving a model-based convex program for a predictive control sequence. This optimization yields an optimal or suboptimal predictive control sequence, and only the first control input of this sequence is applied to the system. MPC is therefore able to handle control problems for which the direct online computation of control inputs is difficult or impossible, such as the control of multivariable systems. However, the predictive control sequence cannot be obtained from convex programming without knowledge of the system model. Moreover, uncertainties in the system model and in the design of the cost function may cause the control sequence to converge to a local optimum. Thus, finding an effective way to address these two difficulties is challenging. Several methods have been proposed to handle them, namely adaptive MPC [5], iterative learning control (ILC) [6], and reinforcement learning (RL) [7].
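For concreteness, one common finite-horizon formulation that instantiates the receding-horizon idea is sketched below; the linear model $x_{k+1}=Ax_k+Bu_k$, the weights $Q$, $R$, $P$, and the horizon $N$ are illustrative placeholders rather than the setting studied in this paper, and input/state constraints are omitted for brevity.
\[
\min_{u_0,\dots,u_{N-1}} \; \sum_{k=0}^{N-1} \left( x_k^{\top} Q x_k + u_k^{\top} R u_k \right) + x_N^{\top} P x_N
\quad \text{s.t.} \quad x_{k+1} = A x_k + B u_k, \; x_0 = x(t),
\]
where only the first element $u_0^{*}$ of the minimizing sequence is applied at time $t$, and the program is re-solved at $t+1$ from the updated state measurement.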