
Balancing detectability and performance of attacks on the control channel of Markov Decision Processes


Abstract:

We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs). This research is motivated by the research community's recent interest in adversarial and poisoning attacks on MDPs and reinforcement learning (RL) methods. The policies resulting from these methods have been shown to be vulnerable to attacks that perturb the observations of the decision-maker. In such an attack, drawing inspiration from adversarial examples used in supervised learning, the amplitude of the adversarial perturbation is bounded according to some norm, in the hope that this constraint will make the attack imperceptible. However, such constraints do not guarantee any level of undetectability and do not account for the dynamic nature of the underlying Markov process. In this paper, we propose a new attack formulation, based on information-theoretic quantities, that jointly considers minimizing the detectability of the attack and degrading the performance of the controlled process. We analyze the trade-off between the efficiency of the attack and its detectability, and conclude with examples and numerical simulations illustrating this trade-off.
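The contrast the abstract draws can be illustrated with a minimal sketch (the function names and numbers below are illustrative, not taken from the paper): a norm constraint merely bounds the amplitude of a perturbation, while an information-theoretic quantity such as the Kullback-Leibler divergence between a nominal distribution and the one induced by the attack gives a statistical measure of detectability.

```python
import numpy as np

def linf_project(delta, eps):
    """Clip a perturbation onto the l-infinity ball of radius eps,
    the kind of amplitude constraint used in norm-bounded attacks."""
    return np.clip(delta, -eps, eps)

def kl_divergence(p, q):
    """KL divergence D(p || q) between two discrete distributions,
    a standard information-theoretic proxy for detectability:
    the smaller it is, the harder the attack is to detect."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Toy example: state-visitation distributions of the nominal and
# the attacked process over three states (made-up numbers).
nominal = np.array([0.5, 0.3, 0.2])
attacked = np.array([0.45, 0.35, 0.2])
print("detectability proxy:", kl_divergence(attacked, nominal))
```

A norm bound on `delta` says nothing about how large this divergence becomes over time, which is the gap the paper's formulation addresses.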
Date of Conference: 08-10 June 2022
Date Added to IEEE Xplore: 05 September 2022
Conference Location: Atlanta, GA, USA
Division of Decision and Control Systems of the EECS School, KTH Royal Institute of Technology, Stockholm, Sweden

I. Introduction

The framework of Markov decision processes (MDPs) has been successfully applied to many types of control systems [1], [2]. Thanks to its simplicity and generality, it can model a wide range of dynamical processes. When the process is unknown, reinforcement learning (RL) techniques have shown great potential. Indeed, during the last decade we have witnessed a surge of interest in RL: by exploiting modern methods in deep learning [3], researchers have reached, and sometimes surpassed, human performance in games such as Go, Dota, and Atari games [4], [5], [6], [7]. RL has also been increasingly used in industrial applications, from temperature control in buildings [8] to health care [9], financial trading [10], and more. RL-based systems are, however, vulnerable to AI cyber-attacks (e.g., leveraging data poisoning or adversarial examples), and as recently pointed out by Gartner and Microsoft [11], [12], only a small fraction of companies have the right tools in place to secure their ML systems.

