Resource management for parallel processing frameworks with load awareness at worker side

Ha, Son-Hai; Brown, Patrick; Michiardi, Pietro
BIGDATA 2017, 6th IEEE International Congress on Big Data
June 25-30, 2017, Honolulu, Hawaii, USA

Many resource management systems and large-scale data processing frameworks use a reservation-based model for managing resources and scheduling tasks. We observe from the reported traces of Facebook and Google that this model leads to resources being wasted because tasks do not effectively use the resources allocated to them. We confirm the problem with a trace of our production cluster. We propose an algorithm to estimate the resource usage at worker nodes. This estimate is used as an input to the scheduler at the resource manager. We verify the stability of the new system in a simulator and develop a prototype of this approach for YARN. Our results in the simulator show that the new model can flexibly match the actual demand of the workload to the capacity of the cluster, avoiding the over-reservation of resources by users. Comparing the worst-case scenario of our management model with the best-case scenario of the reservation model, we obtain almost the same performance and comparable system stability. In practice, our prototype for YARN completes jobs 23% to 44% faster.

Data Science
© 2017 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.