Revisiting Attack-Caused Structural Distribution Shift in Graph Anomaly Detection


Abstract:

Graph anomaly detection (GAD) under the semi-supervised setting poses a significant challenge due to the distinct structural distributions of anomalous and normal nodes. Specifically, anomalous nodes constitute a minority and exhibit high heterophily and low homophily compared to normal nodes. The neighbor distributions of the two types of nodes are close, making them difficult to distinguish during aggregation. Furthermore, we discover that, apart from various time factors and annotation preferences, graph adversarial attacks can amplify the heterophily difference between training and testing data, which we call structural distribution shift (SDS) in this paper. Current GAD methods tend to overlook SDS, resulting in poor generalization and limited effectiveness. This work addresses the problem from a feature view. We observe that the degree of SDS varies between anomalies and normal nodes. Hence, the key lies in (1) resisting high heterophily for anomalies and (2) benefiting the learning of normal nodes from homophily. To this end, we design a Graph Decomposition Network (GDN), which not only teases out the anomaly features that contribute most to GAD, to mitigate the effect of heterophilous neighbors and make them invariant, but also constrains the remaining features of normal nodes to preserve node connectivity and reinforce the influence of the homophilous neighborhood. To further validate the effectiveness of our method, we illustrate the feature decomposition process in the spectral domain and conduct an adversarial attack that induces different degrees of heterophily under SDS. Extensive experimental results demonstrate that our framework improves both accuracy and robustness.
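
As a rough illustration of the heterophily gap described in the abstract, the following Python sketch computes a per-node heterophily score and compares its class-wise mean on a training and a testing split. It assumes one common definition of node heterophily (the fraction of a node's neighbors that carry a different label), which may differ from the measure used in the paper. The toy graph, the split, and the names `node_heterophily` and `mean_heterophily` are hypothetical and chosen only for illustration.

```python
from statistics import mean


def node_heterophily(edges, labels):
    """Fraction of each node's neighbors whose label differs from its own.

    Assumed definition for illustration; isolated nodes are skipped.
    """
    neighbors = {node: set() for node in labels}
    for u, v in edges:  # treat the graph as undirected
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {
        node: sum(labels[n] != labels[node] for n in nbrs) / len(nbrs)
        for node, nbrs in neighbors.items()
        if nbrs
    }


def mean_heterophily(het, nodes, labels, target):
    """Mean heterophily over the nodes in `nodes` that have label `target`."""
    values = [het[n] for n in nodes if labels[n] == target and n in het]
    return mean(values) if values else float("nan")


# Hypothetical toy graph: label 1 = anomaly, label 0 = normal.
labels = {0: 0, 1: 0, 2: 0, 3: 1, 4: 0, 5: 0, 6: 1}
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 6), (4, 5), (4, 6), (5, 6)]
train_nodes, test_nodes = [0, 1, 2, 3], [4, 5, 6]

het = node_heterophily(edges, labels)
train_anom = mean_heterophily(het, train_nodes, labels, target=1)
test_anom = mean_heterophily(het, test_nodes, labels, target=1)
train_norm = mean_heterophily(het, train_nodes, labels, target=0)
test_norm = mean_heterophily(het, test_nodes, labels, target=0)

# A simple proxy for the train/test heterophily gap, computed per class.
shift_anom = test_anom - train_anom
shift_norm = test_norm - train_norm
print(f"anomaly heterophily: train={train_anom:.3f} test={test_anom:.3f}")
print(f"normal heterophily:  train={train_norm:.3f} test={test_norm:.3f}")
```

On this toy graph the two classes show gaps of different size between the splits, which is the kind of class-dependent difference the abstract attributes to SDS. The sketch only measures the gap; it does not simulate an attack or implement GDN.
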
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 36, Issue: 9, September 2024)
Page(s): 4849 - 4861
Date of Publication: 25 March 2024

I. Introduction

Anomalies (a.k.a. fraudsters) are abnormal objects that deviate significantly from normal (a.k.a. benign) ones [1]. Detecting them has garnered considerable attention in various domains, including identifying fake reviews [2] or misinformation in social networks [3], [4], and detecting fraudulent behavior in financial transactions [5]. Generally, abnormal and normal objects are intricately connected through complex relationships, which can be effectively represented as graphs [2], where nodes represent the objects and edges represent their relationships. To address the graph anomaly detection (GAD) problem on such structural data, many state-of-the-art methods [2], [5], [6], [7] adopt a semi-supervised node classification approach, in which only a subset of nodes is labeled as training data while the remainder serves as testing data. To distill discriminative information about the hidden anomalies, these methods mostly apply graph neural networks (GNNs) [8], [9], [10] that propagate label-aware signals along the graph structure.
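
To make the propagation step concrete, here is a minimal Python sketch of one round of mean-neighbor aggregation on scalar node features. It is a weight-free simplification of GNN message passing, not a trained model and not the method of any cited work. The function name `mean_aggregate`, the `self_weight` parameter, and the toy data are assumptions made for illustration.

```python
def mean_aggregate(features, edges, self_weight=0.5):
    """One round of mean-neighbor aggregation on scalar node features.

    Each node's new feature mixes its own value with the mean of its
    neighbors' values. Isolated nodes keep their feature unchanged.
    """
    neighbors = {node: [] for node in features}
    for u, v in edges:  # treat the graph as undirected
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, x in features.items():
        nbrs = neighbors[node]
        if not nbrs:
            updated[node] = x
            continue
        nbr_mean = sum(features[n] for n in nbrs) / len(nbrs)
        updated[node] = self_weight * x + (1 - self_weight) * nbr_mean
    return updated


# Hypothetical toy graph: node 3 is an anomaly surrounded by normal nodes.
features = {0: 0.0, 1: 0.1, 2: 0.0, 3: 1.0}
edges = [(0, 1), (1, 2), (0, 2), (0, 3), (1, 3), (2, 3)]

# Semi-supervised setting: only a subset of nodes carries training labels.
train_labels = {0: 0, 3: 1}  # 1 = anomaly, 0 = normal
unlabeled = [n for n in features if n not in train_labels]

smoothed = mean_aggregate(features, edges)
gap_before = abs(features[3] - features[0])
gap_after = abs(smoothed[3] - smoothed[0])
print(f"feature gap before={gap_before:.3f} after={gap_after:.3f}")
```

In this example the anomaly's feature is pulled toward those of its normal neighbors, so the gap between the two classes shrinks after aggregation. This illustrates why heterophilous neighborhoods can make anomalies harder to separate when signals are propagated along edges.
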
