1. INTRODUCTION
One of the most important drawbacks of applying deep neural networks to sensitive image/video classification tasks is their limited robustness to adversarial attacks, i.e., they are susceptible to being fooled by carefully crafted, minor/humanly imperceptible perturbations. Adversarial attacks are methods that compute such perturbations by exploiting the neural network's backward pass to obtain the gradient flow from the activations of the final (or even some intermediate) layer towards the input, using some loss function. When both the model architecture and parameters are known to the adversary, adversarial attacks are classified as white-box, whereas black-box/transferability attacks are devised from different host models or from the same architecture with different parameters. To date, a wealth of literature describes different forms of adversarial attacks; the reader is referred to the review papers [1], [2].
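For illustration, the sketch below shows how a basic white-box, gradient-based attack of this kind can be implemented, in the spirit of the well-known fast gradient sign method: the loss is computed at the final layer, the gradient is propagated back to the input, and the input is perturbed within a small budget in the direction that increases the loss. The PyTorch framework, the function name `fgsm_attack`, the `epsilon` budget, and the use of cross-entropy loss are illustrative assumptions and not specific to any method surveyed in [1], [2].

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, epsilon=8 / 255):
    """One-step white-box attack sketch: perturb the input along the sign of
    the loss gradient obtained via the network's backward pass (assumed setup)."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits = model(x_adv)                    # forward pass to the final layer
    loss = F.cross_entropy(logits, label)    # loss on the final-layer activations
    loss.backward()                          # gradient flows back towards the input
    # Move each pixel by epsilon in the direction that increases the loss
    perturbation = epsilon * x_adv.grad.sign()
    return (x_adv + perturbation).clamp(0, 1).detach()
```

A black-box/transferability attack would follow the same recipe, except that the gradients are taken from a different host model (or the same architecture with different parameters) and the resulting perturbed input is then presented to the target model.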