Graduate School and Research Center in Digital Sciences

Practical size-based scheduling for MapReduce workloads

Pastorelli, Mario; Barbuzzi, Antonio; Carra, Damiano; Dell'Amico, Matteo; Michiardi, Pietro

arXiv:1302.2749, May 3rd, 2013

We present the Hadoop Fair Sojourn Protocol (HFSP) scheduler, which implements a size-based scheduling discipline for Hadoop. The benefits of size-based scheduling disciplines are well recognized in a variety of contexts (computer networks, operating systems, etc.), yet their practical implementation for a system such as Hadoop raises a number of important challenges. With HFSP, which is available as an open-source project, we address issues related to job size estimation and resource management, and we study the effects of a variety of preemption strategies. Although the architecture underlying HFSP is suitable for any size-based scheduling discipline, in this work we revisit and extend the Fair Sojourn Protocol, which solves problems related to job starvation that affect FIFO, Processor Sharing and a range of size-based disciplines. Our experiments, in which we compare HFSP to standard Hadoop schedulers, point to a significant decrease in average job sojourn times (a metric that accounts for the total time a job spends in the system, including waiting and service times) for realistic workloads that we generate according to production traces available in the literature.
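As a rough illustration of the sojourn-time metric mentioned in the abstract, the sketch below compares the mean sojourn time of a FIFO order with a shortest-job-first order on a single server. It is a toy model and not HFSP: it assumes all jobs arrive at time 0, that job sizes are known exactly, and that there is no preemption or starvation protection. The function name and the job sizes are hypothetical.

```python
def mean_sojourn_time(job_sizes):
    """Mean sojourn time when jobs run one at a time, in the given order.

    Assumption: every job arrives at time 0, so a job's sojourn time
    equals its completion time (waiting time plus service time).
    """
    clock = 0.0
    total = 0.0
    for size in job_sizes:
        clock += size   # this job completes after all jobs ahead of it
        total += clock  # accumulate its sojourn time
    return total / len(job_sizes)


# Hypothetical job sizes (arbitrary time units), listed in arrival order.
job_sizes = [10.0, 1.0, 2.0]

fifo_mean = mean_sojourn_time(job_sizes)                # serve in arrival order
size_based_mean = mean_sojourn_time(sorted(job_sizes))  # serve smallest first

print(f"FIFO mean sojourn time:       {fifo_mean:.2f}")
print(f"Size-based mean sojourn time: {size_based_mean:.2f}")
```

In this toy case, serving the small jobs first lowers the mean because they no longer wait behind the large one. The total amount of work, and therefore the completion time of the last job, is unchanged.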


Title: Practical size-based scheduling for MapReduce workloads
Type: Conference
Language: English
City:
Date:
Department: Data Science
Eurecom ref: 4004
Copyright: © EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in arXiv:1302.2749, May 3rd, 2013 and is available at :
Bibtex:
@inproceedings{EURECOM+4004,
  year = {2013},
  title = {{P}ractical size-based scheduling for {M}ap{R}educe workloads},
  author = {{P}astorelli, {M}ario and {B}arbuzzi, {A}ntonio and {C}arra, {D}amiano and {D}ell'{A}mico, {M}atteo and {M}ichiardi, {P}ietro},
  booktitle = {ar{X}iv:1302.2749, {M}ay 3rd, 2013},
  address = {},
  month = {05},
  url = {http://www.eurecom.fr/publication/4004}
}