I. Introduction
There have been significant advances in the use of artificial intelligence (AI) and machine learning (ML) methods, in particular deep-learning architectures [1], to develop models for applications with complex and high-dimensional data, such as speech recognition [2], [3], biomedicine [4], and many others [5], [6]. Reinforcement learning techniques have also been used to develop automatic controllers that exhibit human-level performance in video games [7], navigation controllers within virtual environments [8], and controllers for many robotic applications [9]. Often these techniques are end-to-end methods in which raw sensory information (such as pixel values) or state estimates are used to directly generate control outputs. Although these learning techniques are very effective for applications like those referenced above, they remain difficult to apply to the control of other classes of complex, real-world cyber-physical systems.