The project: collecting internet threats information using a worldwide distributed honeynet

Leita, Corrado;Pham, Van Hau;Thonnard, Olivier; Ramirez-Silva, Eduardo;Pouget, Fabien;Kirda, Engin;Dacier, Marc

This paper aims at presenting in some depth the project and its data collection infrastructure. Launched in 2003 by the Institut Eurecom, this project is based on a worldwide distributed system of honeypots running in more than 30 different countries. The main objective of the project is to get a more realistic picture of certain classes of threats happening on the Internet, by collecting unbiased quantitative data in a long-term perspective. In the first phase of the project, the data collection infrastructure relied solely on low-interaction sensors based on Honeyd [24] to collect unsolicited traffic on the Internet. Recently, a second phase of the project was started with the deployment of medium-interaction honeypots based on the ScriptGen [15] technology, in order to enrich the network conversations with the attackers. All network traces captured on the platforms are automatically uploaded into a centralized database accessible by the partners via a convenient interface. The collected traffic is also enriched with a set of contextual information (e.g. geographical localization and reverse DNS lookups). This paper presents this complex data collection infrastructure, and offers some insight into the structure of the central data repository. The data access interface has been developed to facilitate the analysis of today’s Internet threats, for example by means of data mining tools. Some concrete examples are presented to illustrate the richness and the power of this data access interface. By doing so, we hope to encourage other researchers to share with us their knowledge and data sets, to complement or enhance our ongoing analysis efforts, with the ultimate goal of better understanding Internet threats

Sécurité numérique
Eurecom Ref:
© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.