Root cause analysis of TCP throughput: methodology, techniques and applications

Siekkinen, Matti


The research community's interest in measuring the Internet has grown tremendously over the last few years. This interest is largely due to the overwhelming growth and expansion of the Internet: traffic volumes and the number of connected devices have grown exponentially. In addition, the heterogeneity of the Internet is constantly increasing: more and more kinds of devices, with different communication needs, reside in or move between different types of networks. This evolution has created many commercial, social, and technical needs to know more about the users, traffic, and devices connected to the Internet. Unfortunately, little such knowledge is available today, and more is required every day. That is why Internet measurement has become a substantial research domain.
This thesis is concerned with TCP traffic. TCP is estimated to carry over 90% of the Internet’s traffic, which is why it plays a crucial role in the functioning of the entire Internet. The most important performance metric for applications is typically throughput, i.e., the amount of data transmitted over a period of time. We define root cause analysis of TCP throughput as the analysis and inference of the reasons that prevent a given TCP connection from achieving a higher throughput. These reasons can be many: the application, the network, or even the TCP protocol itself.
This thesis comprises three parts: methodology, techniques, and applications. The first part introduces our methodology for passive traffic analysis, which is based on a database management system. We explain our approach, the InTraBase, which builds on an object-relational database management system. We also describe our prototype of this approach, implemented on PostgreSQL, and evaluate and optimize its performance.
In the second part, we present the primary contributions of this thesis: the techniques for root cause analysis of TCP throughput. We introduce the different potential causes that can prevent a given TCP connection from achieving a higher throughput and explain in detail the algorithms we developed and used to detect such causes. Given the large heterogeneity and potentially large impact of applications that operate on top of TCP, we emphasize their analysis.
The core of the third part of this thesis is a case study of traffic originating from clients of a commercial ADSL access network. The study focuses on the performance of data transfers from the client's point of view. We discover some surprising results, such as the poor overall performance of P2P file distribution applications, caused by upload rate limits enforced by client applications. The third part essentially binds the first two together: it shows what a system combining the methodology of the first part with the techniques of the second part can do to produce meaningful results in a real-world case study.

