Application-based feature selection for internet traffic classification

En-Najjary, Taoufik; Urvoy-Keller, Guillaume; Pietrzyk, Marcin
ITC 2010, 22nd International Teletraffic Congress, September 7-9, 2010, Amsterdam, The Netherlands

Recently, several statistical techniques using flow features have been proposed to address the problem of traffic classification. These methods generally achieve high recognition rates for the dominant applications but more erratic results for less popular ones. This stems from the process used to select the flow features that serve as inputs to the statistical algorithm, which is biased toward the dominant applications. As a consequence, existing methods are difficult to adapt to the changing needs of network administrators, who might want to quickly identify dominant applications such as P2P or HTTP-based applications, or to zoom in on specific applications that are less popular (in terms of bytes or flows) on a given site, for instance HTTP streaming or Gnutella. We propose a new approach based on logistic regression that aims to address these issues. Our technique automatically selects distinct, per-application features that best separate each application from the rest of the traffic. In addition, it has a low computational cost and needs to inspect only the first few packets of a flow to classify it, which means that it can be implemented in real time. We illustrate our method using two recent traces collected on two ADSL platforms of a large ISP.
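
As a rough illustration of the per-application selection idea described in the abstract, the following Python sketch fits one logistic regression per application (that application versus all other traffic) on standardized flow features, then ranks the features by the magnitude of their coefficients. The feature names, the toy flows, the gradient-descent settings, and the rank-by-coefficient rule are all assumptions made for illustration; the abstract does not specify the authors' actual selection procedure.

```python
import math

# Hypothetical features: sizes (bytes) of the first two packets of a flow.
FEATURES = ["pkt1_size", "pkt2_size"]
# Toy flows, invented for illustration only (not taken from the paper's traces).
FLOWS = [
    [1400, 500], [1380, 900], [1420, 700], [1390, 1100],  # labeled HTTP
    [300, 60], [700, 64], [500, 58], [900, 62],           # labeled P2P
    [200, 800], [600, 1200], [400, 1000], [800, 600],     # labeled OTHER
]
LABELS = ["HTTP"] * 4 + ["P2P"] * 4 + ["OTHER"] * 4


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def fit_one_vs_rest(rows, labels, app, steps=500, lr=0.5):
    """Fit a binary logistic regression (`app` versus all other traffic)
    with plain batch gradient descent."""
    targets = [1.0 if lab == app else 0.0 for lab in labels]
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(rows, targets):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - t
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def top_features(weights, names, k=1):
    """Rank features by absolute coefficient (inputs are standardized)."""
    ranked = sorted(zip(names, weights), key=lambda p: abs(p[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


X = standardize(FLOWS)
# One model, and therefore one feature ranking, per application of interest.
selected = {}
for app in ("HTTP", "P2P"):
    weights, _bias = fit_one_vs_rest(X, LABELS, app)
    selected[app] = top_features(weights, FEATURES)

print(selected)
```

In this toy setup each application ends up with its own top-ranked feature, which is the property the abstract emphasizes: the feature set is chosen per application rather than once for the whole traffic mix.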


Type: Conference
City: Amsterdam
Date: 2010-09-07
Department: Communication systems
Eurecom Ref: 3174
Copyright: © 2010 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Permalink: https://www.eurecom.fr/publication/3174