Source of Publication
Applied Sciences (Switzerland)
In this paper, we propose an advanced method for adversarial training that leverages the underlying structure of adversarial perturbation distributions. Unlike conventional adversarial training techniques, which treat adversarial examples in isolation, our approach combines clustering algorithms with dimensionality reduction to group adversarial perturbations, constructing a richer, more structured feature space for model training. The method incorporates density- and boundary-aware clustering mechanisms to capture the inherent spatial relationships among adversarial examples. Furthermore, we introduce a strategy that uses adversarial perturbations to sharpen the delineation between clusters, yielding more robust and compact clusters. To substantiate the method's efficacy, we performed a comprehensive evaluation on well-established benchmarks, including the MNIST and CIFAR-10 datasets. The evaluation metrics include the trade-off between adversarial and clean accuracy, showing a significant improvement in both robust and standard test accuracy over traditional adversarial training methods. Through empirical experiments, we show that the proposed clustering-based adversarial training framework not only enhances the model's robustness against a range of adversarial attacks, such as FGSM and PGD, but also improves generalization on clean data.
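The abstract's core idea, grouping adversarial perturbations after dimensionality reduction, can be illustrated with a minimal sketch. This is not the paper's algorithm: the perturbations below are synthetic FGSM-style sign vectors around two hypothetical base directions, the reduction is plain PCA via SVD, and the clustering is a bare-bones k-means, all chosen only to show the pipeline shape (reduce, then cluster) under stated assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for FGSM-style perturbations: sign patterns around two
# base directions (illustrative only; not data or code from the paper).
d = 64
base = rng.standard_normal((2, d))
perts = np.concatenate([
    np.sign(base[0] + 0.3 * rng.standard_normal((50, d))),
    np.sign(base[1] + 0.3 * rng.standard_normal((50, d))),
])

# Dimensionality reduction: project centered perturbations onto the top-2
# principal components (PCA via SVD).
X = perts - perts.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[:2].T

# Minimal k-means (k=2) on the reduced space; initialize with one point and
# the point farthest from it so the two seeds start in different regions.
centers = np.stack([Z[0], Z[np.argmax(((Z - Z[0]) ** 2).sum(-1))]])
for _ in range(20):
    labels = np.argmin(((Z[:, None] - centers) ** 2).sum(-1), axis=1)
    centers = np.stack([Z[labels == k].mean(axis=0) for k in range(2)])

print(np.bincount(labels))  # cluster sizes
```

In a real adversarial-training loop, the cluster assignments would then inform how perturbed examples are weighted or grouped during training; the paper's density- and boundary-aware mechanisms go beyond this k-means toy.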
adversarial attacks, adversarial training, clustering, deep neural networks, robustness
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Rasheed, Bader; Khan, Adil; and Masood Khattak, Asad, "Structure Estimation of Adversarial Distributions for Enhancing Model Robustness: A Clustering-Based Approach" (2023). All Works. 6162.
Indexed in Scopus
Open Access Type
Gold: This publication is openly available in an open access journal/series