1 Introduction
As the world increasingly relies on artificial intelligence (AI) for safety-critical applications in healthcare [1], autonomous driving [2], and security systems [3], the robustness of AI systems against adversarial attacks becomes paramount [4]. Before Quantum Machine Learning (QML) can be integrated into such high-risk domains, it is vital to develop models that withstand adversarial environments and malicious manipulation. This need is especially urgent for Quanvolutional Neural Networks (QuNNs), whose robustness remains underexplored compared to that of their classical counterparts.

The studies in [5] and [6] highlight notable adversarial weaknesses in both classical and quantum neural networks. Their findings suggest that, although quantum neural networks exhibit a degree of inherent resilience stemming from their structure, they remain susceptible to strong adversarial attacks. These claims, however, lack thorough empirical validation with respect to the properties of the underlying quantum circuits. Additionally, a recent work [7] explored the relationship between Hilbert space dimensionality and adversarial vulnerability in quantum neural networks, demonstrating that higher dimensionality can reduce robustness. A key question nevertheless remains open: can properties of the Hilbert space itself be leveraged to improve the robustness of QuNNs? Moreover, the work in [8] demonstrated that gradient-based adversarial attacks transfer from classical neural networks to quantum neural networks. That study, however, did not examine the reverse direction, in which attacks crafted on quantum models are applied to classical systems, nor did it consider how the quantum model architecture influences this transferability.
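To make the notion of a gradient-based, transferable attack concrete, the sketch below illustrates the Fast Gradient Sign Method (FGSM) template commonly used in this setting: a perturbation is crafted from the gradients of a surrogate (e.g., classical) model and can then be evaluated against a different (e.g., quantum) model. This is only a minimal sketch assuming a PyTorch classifier; the function name, loss, and epsilon budget are illustrative placeholders and not the exact setup of the cited works.

```python
import torch

def fgsm_attack(surrogate_model, loss_fn, x, y, epsilon=0.1):
    """Craft an adversarial example by one signed-gradient step on the input.

    The perturbation is computed on `surrogate_model`; evaluating the returned
    x_adv on a second model probes the transferability discussed above.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(surrogate_model(x_adv), y)
    loss.backward()
    # Move each input feature by epsilon in the direction that increases the loss,
    # then clip back to a valid input range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```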