In this talk I will discuss our recent work, together with Sebastian Poeplau, on Symbolic execution. Symbolic execution has become a popular technique for software testing and vulnerability detection, in particular, because it allows to generate test cases for difficult to reach program paths. However, a major impediment to practical symbolic execution is speed, especially when compared to near-native speed solutions like fuzz testing.
We first discuss an extensive evaluation (published at ACSAC 2019) of the current symbolic execution tools (Angr, Klee, Qsym). Most implementations transform the program under analysis to some intermediate representation (IR), which is then used as a basis for symbolic execution. There is a multitude of available IRs, and even more approaches to transform target programs into a respective IR. Therefore, we developed a methodology for systematic comparison of different approaches to symbolic execution; we then use it to evaluate the impact of the choice of IR and IR generation.
We will then present SYMCC: our compilation-based approach to symbolic execution. SymCC is an LLVM-based C and C++ compiler that builds concolic execution right into the binary and performs better than state-of-the-art implementations by orders of magnitude. It can be used by software developers as a drop-in replacement for clang and clang++. Using SymCC on real-world software, we found that SymCC consistently achieves higher coverage, and we discovered two vulnerabilities in the heavily tested OpenJPEG project, which have been confirmed by the project maintainers and assigned CVE identifiers.
SymCC received a distinguished paper award at Usenix Security 2020.