On the design space of MapReduce ROLLUP aggregates

Phan, Duy-Hung; Dell'Amico, Matteo; Michiardi, Pietro
ICDT 2014, 17th International Conference on Database Theory, 24-28 March 2014, Athens, Greece, in conjunction with EDBT/ICDT 2014

We de ne and explore the design space of ecient algorithms to compute ROLLUP aggregates, using the MapReduce programming paradigm. Using a modeling approach, we explain the non-trivial trade-off that exists between parallelism and communication costs that is inherent to a MapReduce implementation of ROLLUP. Furthermore, we design a new family of algorithms that, through a single parameter, allow to nd a \sweet spot" in the parallelism vs. communication
cost trade-o . We complement our work with an experimental approach, wherein we overcome some limitations of the model we use. Our results indicate that ecient ROLLUP aggregates require striking the good balance between parallelism and communication for both one-round and chained algorithms.

Type:
Conférence
City:
Athens
Date:
2014-03-28
Department:
Data Science
Eurecom Ref:
4212
Copyright:
© ACM, 2014. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ICDT 2014, 17th International Conference on Database Theory, 24-28 March 2014, Athens, Greece, in conjunction with EDBT/ICDT 2014

PERMALINK : https://www.eurecom.fr/publication/4212