Abstract:
Backdoor attacks have emerged as a prevalent threat to the effectiveness of machine learning models in intelligent vehicles. While such attacks do not impair the normal performance of the trained model, they can be exploited by malicious entities to manipulate model inferences, causing serious problems. In this paper, we design a dynamic gradient clipping (DGC) method that rectifies backdoored models by eliminating the underlying backdoor trigger. First, we construct a repair dataset that fuses clean samples with few-shot backdoor samples to amplify the backdoor behavior when only limited backdoor samples are available. Next, we introduce sample states, determined by the model's inference outcomes, to characterize the backdoor behavior of the target model. Finally, we devise the DGC method to clip parameter gradients to varying degrees, effectively eliminating the backdoor trigger from the target model. Simulation results demonstrate that DGC provides robust defense against four contemporary state-of-the-art backdoor attacks, reducing the attack success rate by 95% with only 0.1% to 4.8% loss in model accuracy.
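For intuition, the following is a minimal PyTorch-style sketch of repair fine-tuning with state-dependent gradient clipping. The repair_step helper, the way a batch's state is decided, and the threshold values are illustrative assumptions for exposition, not the paper's exact DGC procedure.

import torch
import torch.nn.functional as F
from torch.nn.utils import clip_grad_norm_

def repair_step(model, optimizer, x, y, is_backdoor,
                thresh_clean=5.0, thresh_backdoor=0.5):
    """One fine-tuning step on the fused repair dataset.

    is_backdoor marks each sample's state (here supplied by the dataset;
    the paper derives sample states from the model's inference outcomes).
    thresh_clean / thresh_backdoor are placeholder clipping bounds.
    """
    model.train()
    optimizer.zero_grad()

    loss = F.cross_entropy(model(x), y)
    loss.backward()

    # State-dependent clipping: apply a tighter gradient-norm bound when the
    # mini-batch is dominated by samples still exhibiting backdoor behavior.
    max_norm = thresh_backdoor if is_backdoor.float().mean() > 0.5 else thresh_clean
    clip_grad_norm_(model.parameters(), max_norm)

    optimizer.step()
    return loss.item()

In this reading, clipping gradients to different degrees depending on the sample state steers the update toward unlearning the trigger while limiting damage to clean-task accuracy.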
Published in: IEEE Transactions on Dependable and Secure Computing (Volume: 22, Issue: 1, Jan.-Feb. 2025)