I. Introduction
Bayesian optimization is a framework for global optimization of a black-box function via noisy evaluations [2] and provides an alternative to simulated annealing [3], [4] or exhaustive search [5]. These methods have proven adept at hyper-parameter tuning of machine learning models [6], [7], nonlinear system identification [8], experimental design [9], [10], and semantic mapping [11]. More specifically, let $f:\mathcal{X}\rightarrow\mathbb{R}$ denote the function we seek to optimize through noisy samples, i.e., for a given choice ${\mathbf{x}}_{t}\in\mathcal{X}$, we observe $y_{t}=f({\mathbf{x}}_{t})+\epsilon_{t}$ sequentially. We make no assumptions for now on the convexity, smoothness, or other properties of $f$, other than that each function evaluation must be selected judiciously. Our goal is to select a sequence of actions $\{{\mathbf{x}}_{t}\}_{t=1}^{T}$ that attain performance competitive with the optimal selection ${\mathbf{x}}^{*}=\operatorname{argmax}_{{\mathbf{x}}\in\mathcal{X}}f({\mathbf{x}})$. For sequential decision making, a canonical performance metric is regret, which quantifies the performance of a sequence of decisions as compared with the optimal action ${\mathbf{x}}^{*}$: \begin{align*} \textbf{Reg}_{T}:=\sum_{t=1}^{T}(f({\mathbf{x}}^{*})-f({\mathbf{x}}_{t})).\tag{I.1} \end{align*}
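Note that the regret in (I.1) is computed against the true function values, not the noisy observations. A minimal numerical sketch illustrates this bookkeeping; the quadratic objective and the uniformly random query rule below are hypothetical stand-ins for an actual black-box function and acquisition strategy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box objective (unknown to the optimizer).
def f(x):
    return -(x - 0.3) ** 2

# Finite candidate set X and the optimal value f(x*).
X = np.linspace(0.0, 1.0, 101)
f_star = f(X).max()

# Stand-in for an acquisition rule: sample queries uniformly at random.
T = 50
x_t = rng.choice(X, size=T)

# Noisy observations y_t = f(x_t) + eps_t; used by a real optimizer,
# but not by the regret metric itself.
y_t = f(x_t) + 0.01 * rng.standard_normal(T)

# Cumulative regret per (I.1): sum of f(x*) - f(x_t) over t = 1..T.
regret = np.cumsum(f_star - f(x_t))
```

Since $f({\mathbf{x}}^{*})\geq f({\mathbf{x}}_{t})$ for every $t$, each summand is nonnegative and the cumulative regret is nondecreasing in $T$; a good algorithm is one whose regret grows sublinearly, so that the average regret $\textbf{Reg}_{T}/T$ vanishes.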