Multi-objective differential evolution in the generation of adversarial examples

Abstract

Adversarial examples remain a critical concern for the robustness of deep learning models, exposing their vulnerability to subtle input manipulations. While earlier research focused on generating such examples with white-box strategies, later work turned to gradient-based black-box strategies, since models’ internals are often not accessible to external attackers. This paper extends our prior work by exploring gradient-free, search-based algorithms for adversarial example generation, with particular emphasis on differential evolution (DE). Building on the classic DE operators, we propose five variants of gradient-free algorithms: a single-objective approach (DE), two multi-objective variations (NSGA-II-DE and MOEA/D-DE), and two many-objective strategies (NSGA-III-DE and AGE-MOEA-DE). Our study on five canonical image classification models shows that, whilst the DE variant remains the fastest approach, NSGA-II-DE consistently produces more minimal adversarial attacks (i.e., with fewer image perturbations). Moreover, we found that applying a post-processing minimization step to our adversarial images further reduces the number of changes and the overall delta variation (image noise).

Publication
Science of Computer Programming
Annibale Panichella
Associate Professor in Software Engineering

My research interests include software testing, SE for AI, SE for blockchain, and cyber-physical systems.