Recently, several statistical techniques using flow features have been proposed
to address the problem of traffic classification. These methods generally achieve
high recognition rates for the dominant applications but less reliable results for
less popular ones. This stems from the selection of the flow features used
as inputs to the statistical algorithm, which is biased toward those dominant applications.
As a consequence, existing methods are difficult to adapt to the changing
needs of network administrators, who might want to quickly identify dominant applications
such as P2P or HTTP-based applications, or to zoom in on specific less popular
(in terms of bytes or flows) applications on a given site, for instance HTTP
streaming or BitTorrent. We propose a new approach, based on logistic regression, that aims to address
the above-mentioned issues. Our technique incorporates
the following features: (i) automatic selection of a distinct, per-application
feature set that best separates each application from the rest of the traffic; (ii) real-time implementation,
as it needs to inspect only the first few packets of a flow to classify it; (iii)
low computation cost, as logistic regression is implemented by comparing a linear
combination of flow features with a fixed threshold value; (iv) ability to handle
application types that former methods failed to classify. We validate the method
using two recent data sets collected on two ADSL platforms of a large ISP.
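
As an illustration of point (iii), the decision rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation evaluated here: the class name AppModel, the feature names, the weights, and the intercept are all hypothetical values chosen for the example. It relies only on the standard fact that testing whether the logistic function of a score exceeds a probability p is equivalent to testing whether the score itself exceeds log(p / (1 - p)), so no exponential is needed at classification time.

```python
import math
from dataclasses import dataclass


@dataclass
class AppModel:
    """One binary logistic model per application (hypothetical structure)."""
    name: str
    feature_names: tuple   # per-application feature subset (assumed names)
    weights: tuple         # one coefficient per selected feature (made up)
    intercept: float
    threshold: float = 0.5  # probability cut-off, assumed for illustration

    def score(self, flow: dict) -> float:
        # Linear combination of the selected flow features.
        return self.intercept + sum(
            w * flow[f] for w, f in zip(self.weights, self.feature_names)
        )

    def probability(self, flow: dict) -> float:
        # Logistic function applied to the linear score.
        return 1.0 / (1.0 + math.exp(-self.score(flow)))

    def matches(self, flow: dict) -> bool:
        # sigmoid(z) >= p  is equivalent to  z >= log(p / (1 - p)),
        # so classification is a single comparison with a fixed value.
        cutoff = math.log(self.threshold / (1.0 - self.threshold))
        return self.score(flow) >= cutoff


# Hypothetical model using two features from the first packets of a flow.
model = AppModel(
    name="example-app",
    feature_names=("pkt1_size", "pkt2_size"),
    weights=(2.0, -1.0),
    intercept=-1.0,
)

# Hypothetical (already normalized) feature values for two flows.
flow_a = {"pkt1_size": 1.5, "pkt2_size": -0.5, "unused_feature": 9.0}
flow_b = {"pkt1_size": -1.0, "pkt2_size": 0.5, "unused_feature": 9.0}

is_a = model.matches(flow_a)  # score = -1.0 + 3.0 + 0.5 = 2.5
is_b = model.matches(flow_b)  # score = -1.0 - 2.0 - 0.5 = -3.5
```

Each application would carry its own feature_names tuple, which mirrors the per-application feature selection of point (i); features outside that tuple, such as unused_feature above, are simply ignored by that model.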