I. Introduction
Bayesian optimization is a framework for global optimization of a black-box function via noisy evaluations [2] and provides an alternative to simulated annealing [3], [4] or exhaustive search [5]. These methods have proven adept at hyper-parameter tuning of machine learning models [6], [7], nonlinear system identification [8], experimental design [9], [10], and semantic mapping [11]. More specifically, let $f:\mathcal{X}\rightarrow\mathbb{R}$ denote the function we seek to optimize through noisy samples, i.e., for a given choice ${\mathbf{x}}_{t}\in\mathcal{X}$, we observe $y_{t}=f({\mathbf{x}}_{t})+\epsilon_{t}$ sequentially. We make no assumptions for now on the convexity, smoothness, or other properties of $f$, other than that each function evaluation must be selected judiciously. Our goal is to select a sequence of actions $\{{\mathbf{x}}_{t}\}_{t=1}^{T}$ that attain performance competitive with the optimal selection ${\mathbf{x}}^{*}=\operatorname{argmax}_{{\mathbf{x}}\in\mathcal{X}}f({\mathbf{x}})$. For sequential decision making, a canonical performance metric is regret, which quantifies the performance of a sequence of decisions as compared with the optimal action ${\mathbf{x}}^{*}$: \begin{align*} \textbf{Reg}_{T}:=\sum_{t=1}^{T}(f({\mathbf{x}}^{*})-f({\mathbf{x}}_{t})).\tag{I.1} \end{align*}
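Note that the regret in (I.1) is computed against the true function values, not the noisy observations. A minimal numerical sketch illustrates this bookkeeping; the quadratic objective and the uniformly random query rule below are hypothetical stand-ins for an actual black-box function and acquisition strategy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box objective (unknown to the optimizer).
def f(x):
    return -(x - 0.3) ** 2

# Finite candidate set X and the optimal value f(x*).
X = np.linspace(0.0, 1.0, 101)
f_star = f(X).max()

# Stand-in for an acquisition rule: sample queries uniformly at random.
T = 50
x_t = rng.choice(X, size=T)

# Noisy observations y_t = f(x_t) + eps_t; used by a real optimizer,
# but not by the regret metric itself.
y_t = f(x_t) + 0.01 * rng.standard_normal(T)

# Cumulative regret per (I.1): sum of f(x*) - f(x_t) over t = 1..T.
regret = np.cumsum(f_star - f(x_t))
```

Since $f({\mathbf{x}}^{*})\geq f({\mathbf{x}}_{t})$ for every $t$, each summand is nonnegative and the cumulative regret is nondecreasing in $T$; a good algorithm is one whose regret grows sublinearly, so that the average regret $\textbf{Reg}_{T}/T$ vanishes.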