
Analyzing the Noise Robustness of Deep Neural Networks



Abstract:

Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions. Although much work has been done on both adversarial attack and defense, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of both the adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate the datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization, consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how the datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method in explaining the misclassification of adversarial examples.
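
The abstract frames datapath extraction as selecting a subset of critical neurons by constructing and training a neural network. The sketch below is only a rough illustration of that idea, not the authors' formulation: it learns a sparse per-channel gate on one layer of a placeholder torchvision ResNet-18 so that the gated network still reproduces the original prediction. The choice of model, gated layer, learning rate, and sparsity weight are all assumptions made for illustration.

# A rough, hypothetical sketch of the datapath idea: learn a sparse per-channel
# gate on one layer so the gated network still reproduces the original
# prediction. The model (torchvision ResNet-18), the gated layer (layer2),
# and the sparsity weight are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()            # placeholder classifier
for p in model.parameters():
    p.requires_grad_(False)                      # only the gates are trained

x = torch.randn(1, 3, 224, 224)                  # placeholder input image
target = model(x).argmax(dim=1)                  # prediction to preserve

gate_logits = torch.zeros(128, requires_grad=True)   # layer2 has 128 channels
opt = torch.optim.Adam([gate_logits], lr=0.05)

def gated_forward(x, gate):
    h = model.maxpool(model.relu(model.bn1(model.conv1(x))))
    h = model.layer2(model.layer1(h)) * gate.view(1, -1, 1, 1)
    h = model.layer4(model.layer3(h))
    return model.fc(torch.flatten(model.avgpool(h), 1))

for _ in range(200):
    gate = torch.sigmoid(gate_logits)
    # Keep the original class while pushing most gates toward zero.
    loss = F.cross_entropy(gated_forward(x, gate), target) + 0.05 * gate.sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

critical = (torch.sigmoid(gate_logits) > 0.5).nonzero().flatten().tolist()
print("critical layer2 channels (a toy 'datapath'):", critical)

Comparing the channels selected for an adversarial image with those selected for its normal counterpart gives a crude, single-layer analogue of the datapath comparison described in the abstract.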
Published in: IEEE Transactions on Visualization and Computer Graphics ( Volume: 27, Issue: 7, 01 July 2021)
Page(s): 3289 - 3304
Date of Publication: 23 January 2020

PubMed ID: 31985427

1 Introduction

Deep neural networks (DNNs) have demonstrated superior performance in many artificial intelligence applications, such as pattern recognition and natural language processing [1], [2], [3]. However, researchers have recently found that even a highly accurate DNN can be vulnerable to carefully crafted adversarial examples that are intentionally designed to mislead it into making incorrect predictions [4], [5], [6], [7], [8]. For example, an attacker can make imperceptible modifications to a panda image (from $\mathsf{I}_1$ to $\mathsf{I}_2$ in Fig. 1) to mislead a state-of-the-art DNN model [9] into classifying it as a monkey. This phenomenon creates high risk when DNNs are applied to safety- and security-critical applications, such as driverless cars, face-recognition ATMs, and Face ID security on mobile phones [10]. For example, researchers have recently shown that even a state-of-the-art public Face ID system can be fooled by a carefully crafted sticker on a hat [11]. Thus, there is an urgent need to understand the prediction process of adversarial examples and identify the root cause of incorrect predictions [10], [12]. Such an understanding is valuable for developing adversarially robust solutions [13], [14], [15]. A recent survey identifies two important questions that require analysis [10]: (1) why similar images (e.g., adversarial and normal panda images) diverge into different predictions, and (2) why images from different classes (e.g., adversarial panda images and normal monkey images) merge into the same prediction.
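
To make concrete how such an imperceptible modification can be produced, the following is a minimal sketch using the fast gradient sign method (FGSM), one widely known attack. The placeholder ResNet-18 model, the random input, and the perturbation budget eps are illustrative assumptions, not necessarily the attack or setup used in this paper or in [4], [5], [6], [7], [8].

# A minimal FGSM sketch (an illustrative assumption, not necessarily the
# attack studied in the paper): perturb the input in the direction that
# increases the loss of the model's own prediction, within a small budget.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()        # placeholder classifier
x = torch.rand(1, 3, 224, 224)               # placeholder image, pixels in [0, 1]
y = model(x).argmax(dim=1)                   # the model's original prediction

x_adv = x.clone().requires_grad_(True)
F.cross_entropy(model(x_adv), y).backward()  # gradient of the loss w.r.t. the pixels

eps = 8 / 255                                # small, visually imperceptible budget
x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

print("original class:", y.item(),
      "adversarial class:", model(x_adv).argmax(dim=1).item())

With an untrained placeholder model the prediction flip is not guaranteed; against a trained classifier, even this one-step attack often changes the predicted class while leaving the image visually unchanged.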
