ML-based malware detection on dynamic analysis reports is vulnerable to both evasion and spurious correlations. In this work, we investigate a specific ML architecture employed in the pipeline of a widely known commercial antivirus company, with the goal of hardening it against adversarial malware. Adversarial training, the sole defensive technique that can confer empirical robustness, is not applicable out of the box in this domain, principally because gradient-based perturbations rarely map back to feasible problem-space programs. We introduce a novel Reinforcement Learning approach for constructing adversarial examples, a constituent part of adversarially training a model against evasion. Our approach comes with multiple advantages. It performs modifications that are feasible in the problem space, and only those; it thus circumvents the inverse mapping problem. It also makes it possible to provide theoretical guarantees on the robustness of the model against a particular set of adversarial capabilities. Our empirical exploration validates our theoretical insights, and we consistently reach a 0% attack success rate after a few adversarial retraining iterations.
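To make the overall workflow concrete, the sketch below illustrates an adversarial-retraining loop of this kind: an attacker repeatedly applies feasible problem-space actions to malware samples, and successful evasions are fed back into the training set. This is only a minimal illustration under assumed interfaces; the detector class, the action set, and the rl_attack helper are hypothetical placeholders (a random action picker stands in for the learned RL policy) and do not reflect the paper's actual pipeline.

```python
# Hypothetical sketch of adversarial retraining with a problem-space attacker.
# All names (Detector, rl_attack, adversarial_retraining) are illustrative only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


class Detector:
    """Stand-in for an ML detector trained on dynamic-analysis features."""

    def __init__(self):
        self.clf = GradientBoostingClassifier()

    def fit(self, X, y):
        self.clf.fit(X, y)

    def predict(self, X):
        return self.clf.predict(X)


def rl_attack(detector, sample, actions, budget=10):
    """Placeholder attacker: applies feasible problem-space actions until the
    detector is evaded or the budget is spent. A real RL agent would select
    actions with a learned policy; here a random choice stands in for it."""
    current = sample
    for _ in range(budget):
        if detector.predict(current.reshape(1, -1))[0] == 0:  # classified benign
            return current, True
        current = actions[np.random.randint(len(actions))](current)
    return current, False


def adversarial_retraining(detector, X, y, actions, rounds=5):
    """Outer hardening loop: attack malware samples, add successful evasions
    back to the training set with the malware label, retrain, repeat."""
    detector.fit(X, y)
    for _ in range(rounds):
        adv, evaded = [], 0
        for x, label in zip(X, y):
            if label != 1:  # only perturb malware samples
                continue
            x_adv, success = rl_attack(detector, x, actions)
            if success:
                adv.append(x_adv)
                evaded += 1
        asr = evaded / max(1, int(y.sum()))
        print(f"attack success rate: {asr:.2%}")
        if not adv:  # 0% ASR reached against this action set
            break
        X = np.vstack([X, np.array(adv)])
        y = np.concatenate([y, np.ones(len(adv))])
        detector.fit(X, y)
    return detector
```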
How to train your antivirus: RL-based hardening through the problem space
RAID 2024, 27th International Symposium on Research in Attacks, Intrusions and Defenses, 30 September-2 October 2024, Padua, Italy
Type:
Conference
City:
Padua
Date:
2024-09-30
Department:
Digital Security
Eurecom Ref:
7623
Copyright:
© ACM, 2024. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in RAID 2024, 27th International Symposium on Research in Attacks, Intrusions and Defenses, 30 September-2 October 2024, Padua, Italy https://doi.org/10.1145/3678890.3678912
See also:
PERMALINK: https://www.eurecom.fr/publication/7623