Bernhard Ager - Technische Universität Berlin An-Institut Deutsche Telekom Laboratories Corporate communication
Date: - Location: Eurecom
Today's Internet traffic is dominated by users' demand for exchanging content. In particular, multimedia content (photos, music, and video) as well as software downloads and updates contribute substantially to today's Internet traffic. One option for reducing network costs is to use caches, exploiting the observation that content popularity is consistent with Zipf's law. In this work we start by exploring the potential traffic savings from caching the most prominent protocols in our environment: HTTP, BitTorrent, eDonkey, and NNTP. We base our analysis on anonymized traces, collected in 2009, from a large European ISP connecting more than 20,000 residential DSL customers to the Internet. We find that, on the one hand, P2P applications generally show very high cacheability rates, but on the other hand, HTTP cacheability is often hampered by the server-side configuration of content distribution networks (CDNs). CDNs own widely distributed server farms with thousands of hosts and perform their own traffic flow optimization. While content delivery systems may, to some extent, consider the user's performance within their optimization criteria, they currently have no incentive to consider any of the ISP's constraints. As a consequence, the ISP has "lost control" over a major part of its traffic. We find that a CDN's content caches are ubiquitous and reply to any request for content hosted by the particular CDN. Thus, another option for saving traffic is to prefer nearby content hosts instead of the one selected by the CDN alone. Both users and ISPs can benefit from biasing the location of the content cache. Interestingly, a CDN's server selection is dictated not only by the location of the client: in today's Internet, the location of the DNS server also has an enormous impact. We compare local DNS resolvers against GoogleDNS and OpenDNS for a large set of vantage points.
Our end-host measurements inside more than 50 commercial ISPs reveal significant diversity, even at the AS level, among the answers provided by the studied DNS resolvers. We attribute this diversity to the location-awareness of CDNs as well as to the location of the DNS resolvers, which breaks the assumption CDNs make about the proximity of the end user to its DNS resolver. In addition, we observe that two aspects have a significant impact on responsiveness: (1) the latency to the DNS resolver, and (2) the content of the DNS cache when the query is issued.
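To make the caching argument concrete, the following minimal sketch estimates the share of requests an idealized cache could serve when object popularity follows Zipf's law. It is an illustration only, not the methodology of the talk: the function name `zipf_hit_rate`, the exponent `alpha`, and the catalogue and cache sizes are assumptions chosen for the example, and the model counts requests rather than bytes and ignores whether an object is actually cacheable.

```python
def zipf_hit_rate(num_objects, cache_size, alpha=1.0):
    """Request hit rate of a cache holding the `cache_size` most popular objects.

    Assumes (hypothetically) that the object of popularity rank r is requested
    with probability proportional to 1 / r**alpha, i.e. a Zipf-like law.
    """
    if num_objects < 1:
        raise ValueError("num_objects must be at least 1")
    if not 0 <= cache_size <= num_objects:
        raise ValueError("cache_size must be between 0 and num_objects")

    # Unnormalized popularity of each object, most popular first.
    weights = [1.0 / rank ** alpha for rank in range(1, num_objects + 1)]

    # Requests for cached objects divided by all requests.
    return sum(weights[:cache_size]) / sum(weights)


if __name__ == "__main__":
    # Illustrative numbers only: a catalogue of 100,000 objects and
    # caches holding 0.1 %, 1 %, and 10 % of it.
    for size in (100, 1_000, 10_000):
        rate = zipf_hit_rate(100_000, size)
        print(f"cache of {size:>6} objects: hit rate {rate:.2f}")
```

Under these assumptions the hit rate grows much faster than the cache size, which is why a comparatively small cache can absorb a large share of requests when popularity is heavily skewed.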