Needles in a haystack: Mining information from public dynamic analysis sandboxes for malware intelligence

Graziano, Mariano; Canali, Davide; Bilge, Leyla; Lanzi, Andrea; Balzarotti, Davide
USENIX 2015, 24th USENIX Security Symposium, August 12-14, 2015, Washington DC, USA

Malware sandboxes are automated dynamic analysis systems that execute programs in a controlled environment. Among the large volumes of samples submitted to these services every day, some submissions stand out from the rest and show interesting characteristics. For example, we observed that malware samples involved in famous targeted attacks - like the Regin APT framework or the recently disclosed malware from the Equation Group - were submitted to our sandbox months or even years before they were detected in the wild. In other cases, the malware developers themselves interact with public sandboxes to test their creations or to develop a new evasion technique. We refer to such cases as malware developments. In this paper, we propose a novel methodology to automatically identify malware development cases from the samples submitted to a malware analysis sandbox. The results of our experiments show that, by combining dynamic and static analysis with features based on the file submission, it is possible to achieve good accuracy in automatically identifying cases of malware development. Our goal is to raise awareness of this problem and of the importance of looking at these samples from an intelligence and threat prevention point of view.

Digital Security
Eurecom Ref:
Copyright Usenix. Personal use of this material is permitted. The definitive version of this paper was published in USENIX 2015, 24th Usenix Security Symposium, August 12-14, 2015, Washington DC, USA and is available at :