Embedded systems are omnipresent in everyday life and are increasingly common in computing and networked environments. For example, they are at the core of various Commercial Off-The-Shelf (COTS) devices such as printers, video surveillance systems, home routers, and virtually anything we informally call electronics. The emerging Internet of Things (IoT) will make them even more widespread and interconnected. Cisco famously predicted that there would be 50 billion connected embedded devices by 2020.
Given these estimates, the heterogeneity of technologies and application fields, and the current threat landscape, the security of these devices is of paramount importance. Manual security analysis, however, does not scale. Novel, scalable, and automated approaches are therefore needed.
In this thesis, we present several methods that make large-scale security analysis of embedded devices feasible. We implemented these techniques in a scalable framework and tested it on real-world data. First, we collected a large number of firmware images from Internet repositories. We then unpacked a large subset of them and performed simple static analysis, which led to the discovery of many new vulnerabilities and allowed us to identify five important challenges.
Embedded devices often expose web interfaces for remote administration. We therefore developed techniques for large-scale static and dynamic analysis of such interfaces. This allowed us to find a large number of new vulnerabilities and to identify the limitations of emulation and web security tools.
Finally, identifying and classifying firmware files is difficult, especially at large scale. For this reason, we proposed Machine Learning (ML) techniques and features for firmware file classification. We also developed multi-metric score fusion approaches to fingerprint and identify embedded devices at the web interface level.
Using these techniques, we discovered many new vulnerabilities across a large number of firmware packages, affecting a wide variety of vendors and device classes. We also achieved high accuracy in fingerprinting and classifying both firmware images and live devices.