Lost in the loader: The many faces of the Windows PE file format

Nisi, Dario; Graziano, Mariano; Fratantonio, Yanick; Balzarotti, Davide

RAID 2021, 24th International Symposium on Research in Attacks, Intrusions and Defenses, 6-8 October 2021, Donostia/San Sebastian, Spain

A known problem in the security industry is that programs that deal with executable file formats, such as OS loaders, reverse-engineering tools, and antivirus software, often have small discrepancies in the way they interpret an input file. These differences can be abused by attackers to evade detection or complicate reverse engineering, and are often found by researchers through a manual, trial-and-error process. In this paper, we present the first systematic analysis and exploration of PE parsers. To this end, we developed a framework to easily capture the details of how different software parses, checks, and validates whether a file is compliant with a set of specifications. We then used this framework to create models for the loaders of three versions of Windows (XP, 7, and 10) and for several reverse-engineering and antivirus tools. Finally, we used this framework to automatically compare different models, generate new samples from a model, or validate an executable according to a chosen model. Our system also supports more complex tasks, such as “generating samples that would load on Windows 10 but not on Windows 7.” The results of our analysis have consequences for several aspects of system security. We show that popular analysis tools can be completely bypassed, that the information extracted by these analysis tools can be easily manipulated, and that it is trivial for malware authors to fingerprint and “target” only specific versions of an operating system in ways that are not obvious to someone analyzing the executable. But, more importantly, we show that there is not one correct way to parse PE files, and therefore that it is not sufficient for security tools to fix the many inconsistencies we found in our experiments. Instead, to tackle the problem at its roots, tools should allow the analyst to select which of the several loader models they should emulate.
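To make the idea of "validating an executable according to a chosen model" concrete, the following is a minimal sketch and not the framework described in the paper. It assumes a model can be approximated as a named set of checks over the raw bytes. The names `MODEL_A`, `MODEL_B`, `accepts`, `failed_checks`, and `build_sample` are hypothetical, and the two models do not correspond to any real Windows loader or analysis tool. Only the basic layout facts are standard: the `MZ` magic at offset 0, the `e_lfanew` field at offset 0x3C, the `PE\0\0` signature, and `NumberOfSections` in the COFF header.

```python
import struct
from typing import Callable, Dict, List, Optional

Check = Callable[[bytes], bool]


def pe_offset(data: bytes) -> Optional[int]:
    """Return e_lfanew (offset of the PE signature), or None if the file is too short."""
    if len(data) < 0x40:
        return None
    return struct.unpack_from("<I", data, 0x3C)[0]


def has_mz_magic(data: bytes) -> bool:
    return data[:2] == b"MZ"


def has_pe_signature(data: bytes) -> bool:
    off = pe_offset(data)
    return off is not None and data[off:off + 4] == b"PE\0\0"


def has_sections(data: bytes) -> bool:
    """True if the COFF header's NumberOfSections field is non-zero."""
    off = pe_offset(data)
    if off is None or len(data) < off + 8:
        return False
    return struct.unpack_from("<H", data, off + 6)[0] != 0


# Two HYPOTHETICAL models: B is stricter than A. Neither is a real loader.
MODEL_A: Dict[str, Check] = {
    "mz_magic": has_mz_magic,
    "pe_signature": has_pe_signature,
}
MODEL_B: Dict[str, Check] = {**MODEL_A, "nonzero_sections": has_sections}


def failed_checks(data: bytes, model: Dict[str, Check]) -> List[str]:
    """Names of the checks in `model` that `data` does not satisfy."""
    return [name for name, check in model.items() if not check(data)]


def accepts(data: bytes, model: Dict[str, Check]) -> bool:
    return not failed_checks(data, model)


def build_sample(num_sections: int) -> bytes:
    """Build a header-only stub (not a loadable executable) for the checks above."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # e_lfanew points right after the DOS header
    coff = b"PE\0\0" + struct.pack("<HH", 0x014C, num_sections)  # Machine, NumberOfSections
    return bytes(dos) + coff


if __name__ == "__main__":
    sample = build_sample(0)
    print("model A accepts:", accepts(sample, MODEL_A))
    print("model B rejects because of:", failed_checks(sample, MODEL_B))
```

A stub with zero sections is accepted by the hypothetical model A and rejected by model B, which is the same kind of question as "loads on one system but not on another", reduced to a toy setting.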

Type: Conference
City: Donostia
Date: 2021-10-06
Department: Digital Security
Eurecom Ref: 6603

© ACM, 2021. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in RAID 2021, 24th International Symposium on Research in Attacks, Intrusions and Defenses, 6-8 October 2021, Donostia/San Sebastian, Spain, http://doi.org/10.1145/3471621.3471848