Flexible scheduling of distributed analytic applications

Pace, Francesco; Venzano, Daniele; Carra, Damiano; Michiardi, Pietro
Technical Report, November 29th, 2016

This work addresses the problem of scheduling user-defined analytic applications, which we define as high-level compositions of frameworks, their components, and the logic necessary to carry out work. The key idea in our application definition, is to distinguish classes of components, including rigid and elastic types: the first being required for an application to make progress, the latter contributing to reduced execution times. We show that the problem of scheduling such applications poses new challenges, which existing approaches address inefficiently. 

Thus, we present the design and evaluation of a novel, flexible heuristic to schedule analytic applications, that aims at high system responsiveness, by allocating resources efficiently. Our algorithm is evaluated using trace-driven simulations, with large-scale real system traces: our flexible scheduler outperforms a baseline approach across a variety of metrics, including application turnaround times, and resource allocation efficiency. 

We also present the design and evaluation of a full-fledged system, which we have called Zoe, that incorporates the ideas presented in this paper, and report concrete improvements in terms of efficiency and performance, with respect to prior generations of our system.

Data Science
Eurecom Ref:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Technical Report, November 29th, 2016 and is available at :

PERMALINK : https://www.eurecom.fr/publication/5083