Title: Caching and Multicast Distribution of Web Documents on the Internet

PhD Student: Pablo Rodriguez

Supervisor: Prof. Dr. Ernst W. Biersack

Key Words: Web, Caching, Push, Multicast, FEC.



Description:

The success of the World Wide Web has led to a steep increase in the user population and in the amount of traffic on the Internet. This growth has created three main problems: (i) scarce network bandwidth, (ii) high latencies, and (iii) overloaded servers. To improve the performance of the Web and alleviate these problems, we consider Web caching and multicast.

Currently, there is a large class of documents containing "short-lived" information, such as news, stock market quotes, or auction data, that change frequently and are of interest to a large number of receivers. For this kind of document we have proposed a continuous multicast push (CMP) from the servers to the receivers. A Web server continuously multicasts the latest version of a popular and frequently changing Web document on a multicast address. Receivers tune into the multicast address for the time required to reliably receive the document and then leave the group. To obtain the Web documents reliably, we consider forward error correction (FEC) codes.
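The receiver side of CMP can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the scheme evaluated in the publications: it assumes an ideal erasure code in which any k distinct packets out of n encoded packets rebuild the document, independent packet losses, and a server that cycles through the n packets forever. The function name and all parameter values are hypothetical.

```python
import random


def packets_until_decoded(k, n, loss_rate, start, rng):
    """Return how many carousel transmissions pass before a receiver
    holds k distinct packets out of the n encoded packets.

    Assumption: an ideal erasure code, so any k distinct packets suffice.
    """
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    received = set()
    listened = 0
    index = start % n                    # receiver joins the group mid-cycle
    while len(received) < k:
        listened += 1
        if rng.random() >= loss_rate:    # packet survived the network
            received.add(index)
        index = (index + 1) % n          # server keeps cycling through packets
    return listened                      # receiver can now leave the group


if __name__ == "__main__":
    rng = random.Random(1)
    for start in (0, 7, 13):
        cost = packets_until_decoded(k=10, n=15, loss_rate=0.2, start=start, rng=rng)
        print(f"joined at packet {start}: listened to {cost} transmissions")
```

Without losses the receiver needs exactly k transmissions wherever it joins the cycle, which is why no retransmission requests have to travel back to the server.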

Caching, like multicast, is a mechanism for reducing bandwidth usage and the latency perceived by receivers on the Internet. ISPs throughout the world are aggressively deploying caches for this purpose, and many of these caches cooperate through a caching hierarchy. We have found that, except for fast-changing documents (i.e., documents that change every few minutes), hierarchical caching performs better than a multicast distribution in terms of latency and bandwidth. Hierarchical caching provides asynchronous reliable multicast without the complexity of managing dynamic multicast trees and heterogeneous receivers.
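The way a caching hierarchy behaves like an asynchronous multicast tree can be sketched as follows. This is a minimal illustration, assuming a simple parent-pointer hierarchy with no expiration or consistency checks; the class, the cache names, and the in-memory origin dictionary are all hypothetical.

```python
class Cache:
    """One cache in a hierarchy; a miss is forwarded to the parent cache."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}

    def get(self, url, origin):
        """Return (document, name of the node that supplied it)."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            body, source = self.parent.get(url, origin)
        else:
            body, source = origin[url], "origin"   # top level asks the server
        self.store[url] = body                     # keep a copy on the way down
        return body, source


if __name__ == "__main__":
    origin = {"/news": "version 1"}
    national = Cache("national")
    regional = Cache("regional", parent=national)
    inst_a = Cache("institutional-a", parent=regional)
    inst_b = Cache("institutional-b", parent=regional)

    print(inst_a.get("/news", origin))   # travels up to the origin server
    print(inst_b.get("/news", origin))   # served by the shared regional cache
    print(inst_a.get("/news", origin))   # served locally
```

The document crosses each link above the regional cache only once, yet the two institutional caches fetch it at different times, which is the sense in which the hierarchy acts as an asynchronous multicast.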

Recently, there has been a considerable increase in the number of Web sites that offer automated delivery of personalized and up-to-date information. Clients subscribe to a document, and new document updates reach the clients without the clients making explicit requests. Automated delivery is usually implemented with a pull-push mechanism in which the clients periodically poll the origin servers. When the information is updated at random times and clients must immediately receive the most up-to-date document version, the origin server is burdened with many poll requests and pull-push does not scale. For this latter case, we are considering a pure push distribution through a caching infrastructure.
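A back-of-the-envelope calculation shows why polling does not scale. The sketch below assumes that every client polls the origin server directly at a fixed interval, and that under push the origin only sends each update once to each top-level cache, which then forwards it further down. The function names and all numbers are illustrative assumptions, not measurements from this work.

```python
def poll_requests_per_second(clients, poll_interval_s):
    """Origin load when every client polls once per interval."""
    return clients / poll_interval_s


def push_messages_per_second(top_level_caches, updates_per_second):
    """Origin load when each update is pushed once to each top-level cache."""
    return top_level_caches * updates_per_second


if __name__ == "__main__":
    # Hypothetical scenario: 100,000 subscribers polling every 60 seconds,
    # versus 10 top-level caches and one document update every 5 minutes.
    print(poll_requests_per_second(100_000, 60))
    print(push_messages_per_second(10, 1 / 300))
```

Under polling the origin load grows with the number of clients and with the freshness they demand, while under push it depends only on the update rate and the number of caches the origin feeds directly.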

Prefetching documents into Web caches can significantly increase the hit rates. However, prepopulating caches has two main drawbacks: it requires additional disk space and additional network bandwidth. Disk space may not be much of a problem, because disk capacity is increasing very fast and large disks are becoming steadily cheaper. Network bandwidth is a more serious problem, especially when documents are prefetched through the Internet. An alternative way to prepopulate caches with many documents is a satellite distribution, which suffers from less congestion and fewer losses than the Internet. A satellite distribution of Web documents into large caches can significantly increase the hit rates, bringing the Web to the network edge.

The number of multimedia documents, including audio and video objects, is rapidly increasing on the Internet. Even though large disks are becoming steadily cheaper, in a few years caches may not have enough disk capacity to store large video and audio files. As a consequence, there is a need for better and more efficient caching infrastructures that exploit the nature and streaming properties of audio and video objects.
 



Publications

[BRF04]
E. W. Biersack, P. Rodriguez, and P. Felber. Performance Analysis of Peer-to-Peer Networks for File Distribution. In Proceedings of the Fifth International Workshop on Quality of Future Internet Services (QofIS'04), Barcelona, Spain, September 2004.
[RB02a]
Pablo Rodriguez and Ernst W. Biersack. Bringing the Web to the Network Edge: Large Caches and Satellite Distribution. Mobile Networks and Applications, 7(1):67--78, January 2002.
[RB02b]
Pablo Rodriguez and Ernst W. Biersack. Dynamic Parallel-Access to Replicated Content in the Internet. IEEE/ACM Transactions on Networking, 10(4):455--464, August 2002.
[RSB01]
Pablo Rodriguez, Christian Spanner, and Ernst W. Biersack. Analysis of Web Caching Architectures: Hierarchical and Distributed Caching. IEEE/ACM Transactions on Networking, 9(4):404--418, August 2001.
[HRB00]
Xiao-Yu Hu, Pablo Rodriguez, and Ernst W. Biersack. Performance Study of Satellite-Linked Web Caches and Filtering Policies. In Networking 2000, pages 580--595, Paris, May 2000.
[RKB00]
Pablo Rodriguez, Andreas Kirpal, and Ernst W. Biersack. Parallel-Access for Mirror Sites in the Internet. In Proc. of Infocom, Tel-Aviv, Israel, March 2000.
[RS00]
Pablo Rodriguez and Sandeep Sibal. SPREAD: Scalable Platform for Reliable and Efficient Automated Distribution. EURECOM, AT&T Research Labs. In Proc. of the 9th World Wide Web Conference, May 2000.
[RSS00]
Pablo Rodriguez, Sandeep Sibal, and Oliver Spatscheck. TPOT: Translucent Proxying of TCP. AT&T Research Labs. In Proc. of the 4th International Caching Workshop, Lisbon. Also as Technical report TR 00.4.1, 2000.
[Rod00]
Pablo Rodriguez. Scalable Content Distribution in the Internet. PhD thesis, EPFL, Lausanne, Switzerland, Institut Eurecom, October 2000.
[RSB99]
Pablo Rodriguez, Christian Spanner, and Ernst W. Biersack. Web Caching Architectures: Hierarchical and Distributed Caching. In Proceedings of the 4th International Caching Workshop, San Diego, March 1999.
[RBR98]
Pablo Rodriguez, Ernst W. Biersack, and Keith W. Ross. Improving the Latency in the Web: Caching or Multicast? In 3rd International WWW Caching Workshop, Manchester, UK, June 1998.
[RGN98]
Pablo Rodriguez, Jamel Gafsi, and Jorg Nonnenmacher. A more Attractive and Interactive TV. In W3C Workshop on Television and the Web, Sophia Antipolis, France, July 1998. Position Paper.
[RRB98]
Pablo Rodriguez, Keith W. Ross, and Ernst W. Biersack. Distributing Frequently-Changing Documents in the Web: Multicasting or Hierarchical Caching. Computer Networks and ISDN Systems. Selected Papers of the 3rd International Caching Workshop, pages 2223--2245, 1998.
[RW98]
P. Rodriguez and Ernst W. Biersack. Continuous Multicast Push of Web Documents over the Internet. IEEE Network Magazine, 12(2):18--31, Mar-Apr 1998.
