Graduate School and Research Center in Communication Systems


Eurecom - Networking and Security 
PhD Student (2011 - 2014)


Size-Based Disciplines for Job Scheduling in Data-Intensive Scalable Computing Systems


The past decade has seen the rise of \emph{data-intensive scalable computing} (DISC) systems, such as Hadoop, and the consequent demand for scheduling policies to manage their resources, so that they can provide quick response times as well as fairness.

Schedulers for DISC systems usually focus on fairness, without optimizing response times. The best practices to overcome this problem rely on manual, ad-hoc control of the scheduling policy, which is error-prone and difficult to adapt to changes.

In this thesis we focus on size-based scheduling for DISC systems. The main contribution of this work is the \emph{Hadoop Fair Sojourn Protocol} (HFSP) scheduler, a size-based preemptive scheduler with aging; it provides fairness and achieves reduced response times thanks to its size-based nature. In DISC systems, job sizes are not known a priori: therefore, HFSP includes a job size estimation module, which computes approximate job sizes and refines these estimates as jobs progress.
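
As a rough illustration of what size-based scheduling with aging can look like, the following Python sketch serves the job with the smallest remaining size in a simulated fair-share system, and ages every job by advancing that simulation. It is a toy model under stated assumptions, not the HFSP implementation: the names `Job`, `make_job`, `age`, `pick_next` and `run` are hypothetical, aging is assumed to follow a simulated processor-sharing system, and the whole cluster is modeled as a single resource served in fixed time steps.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    estimated_size: float     # estimated total work (may be refined over time)
    virtual_remaining: float  # remaining work in the simulated fair-share system
    real_remaining: float     # true remaining work, unknown to the scheduler


def make_job(name, estimated_size, true_size):
    """Hypothetical helper: the virtual size starts at the estimate."""
    return Job(name, estimated_size, estimated_size, true_size)


def age(jobs, dt):
    """Aging (assumption): advance a simulated processor-sharing system by dt.

    Every job that is still active in the simulation receives an equal share
    of dt, so even a large job sees its virtual size shrink while it waits.
    """
    active = [j for j in jobs if j.virtual_remaining > 0]
    if not active:
        return
    share = dt / len(active)
    for j in active:
        j.virtual_remaining = max(0.0, j.virtual_remaining - share)


def pick_next(jobs):
    """Choose the unfinished job with the smallest virtual remaining size."""
    pending = [j for j in jobs if j.real_remaining > 0]
    if not pending:
        return None
    return min(pending, key=lambda j: j.virtual_remaining)


def run(jobs, dt=1.0):
    """Serve one job per time step, preempting whenever the choice changes.

    Returns the job names in completion order.
    """
    completed = []
    while True:
        job = pick_next(jobs)
        if job is None:
            return completed
        job.real_remaining -= dt  # the real system serves only the chosen job
        age(jobs, dt)             # the simulated system ages all jobs
        if job.real_remaining <= 0:
            completed.append(job.name)
```

With three jobs of sizes 10, 2 and 5 and exact estimates, `run` completes them from smallest to largest. Because the scheduler only ever reads `virtual_remaining`, an inaccurate estimate changes the service order but not the mechanism.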

We show that the impact of estimation errors on size-based policies is not significant under conditions that hold in a system such as Hadoop. Because of this, and because it is designed from the start to work with estimated sizes, HFSP is largely tolerant of job size estimation errors. Our experimental results show that, in a real Hadoop deployment and with realistic workloads, HFSP performs better than the built-in scheduling policies, achieving both fairness and small mean response times. Moreover, HFSP maintains its good performance even when the cluster is heavily loaded, by focusing resources on the few jobs with the smallest size.

HFSP is a preemptive policy, and preemption in a DISC system can be implemented with different techniques. The approaches currently available in Hadoop have shortcomings that affect system performance. Therefore, we have implemented a new preemption technique, called \emph{suspension}, that exploits operating system primitives to implement preemption in a way that guarantees low latency without penalizing low-priority jobs.
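
To make the idea of suspension concrete, the sketch below pauses and resumes a child process with the POSIX signals `SIGSTOP` and `SIGCONT`. The abstract does not say which operating system primitives the thesis uses, so the choice of signals is an assumption, and the helper names `suspend`, `resume` and `wait_until_stopped` are hypothetical. The point it illustrates is that a suspended process keeps the work it has already done, unlike a killed one that must restart from scratch.

```python
import os
import signal


def suspend(pid):
    """Pause a process (assumption: suspension is done with SIGSTOP)."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Let a suspended process continue from where it was paused."""
    os.kill(pid, signal.SIGCONT)


def wait_until_stopped(pid):
    """Block until the child process `pid` changes state.

    Returns True if the child is now stopped, False if it exited instead.
    Works only for direct children of the calling process.
    """
    _, status = os.waitpid(pid, os.WUNTRACED)
    return os.WIFSTOPPED(status)
```

A scheduler built along these lines would call `suspend` on the tasks of a low-priority job when a higher-priority job needs their slots, and `resume` once the slots are free again. This sketch runs only on POSIX systems.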