How many planet-wide leaders should there be?

Liu, Shengyun; Vukolic, Marko
DCC 2015, 3rd Workshop on Distributed Cloud Computing, co-located with ACM SIGMETRICS 2015, June 15th, Portland, USA

Geo-replication is becoming increasingly important for modern planetary-scale distributed systems, yet it comes with a specific challenge: latency, bounded by the speed of light. In particular, clients of a geo-replicated system must communicate with a leader, which must in turn communicate with other replicas: a wrong selection of the leader may result in unnecessary round-trips across the globe. Classical protocols, such as the celebrated Paxos, have a single leader, making them unsuitable for serving widely dispersed clients. To address this issue, several all-leader geo-replication protocols have been proposed recently, in which every replica acts as a leader. However, because these protocols require coordination among all replicas, committing a client's request at some replica may incur the so-called "delayed commit" problem, which can introduce even higher latency than a classical single-leader majority-based protocol such as Paxos. In this paper, we argue that the "right" choice of the number of leaders in a geo-replication protocol depends on a given replica configuration, and propose Droopy, an optimization for state machine replication protocols that explores the space between single-leader and all-leader by dynamically reconfiguring the leader set. We implement Droopy on top of Clock-RSM, a state-of-the-art all-leader protocol. Our evaluation on Amazon EC2 shows that, under typical imbalanced workloads, Droopy-enabled Clock-RSM efficiently reduces latency compared to native Clock-RSM, whereas in other cases the latency is the same as that of the native Clock-RSM.
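
To make the leader-placement argument concrete, the sketch below estimates commit latency in a single-leader, majority-based protocol under a deliberately simplified model: a request costs one client-to-leader round trip plus the time the leader needs to hear back from a majority of replicas. It is not Droopy or Clock-RSM, and it does not model the delayed commit problem. The site names, the round-trip times, and the functions `majority_round_trip`, `commit_latency` and `best_leader` are all hypothetical and chosen only for illustration.

```python
# Hypothetical round-trip times in milliseconds between three sites.
# Illustrative values only; they are not measurements from the paper.
RTT = {
    "us": {"us": 0, "eu": 90, "asia": 160},
    "eu": {"us": 90, "eu": 0, "asia": 250},
    "asia": {"us": 160, "eu": 250, "asia": 0},
}


def majority_round_trip(leader, rtt):
    """Time for the leader to hear back from a majority of replicas, itself included."""
    majority = len(rtt) // 2 + 1
    # The leader reaches itself with RTT 0, so the majority-th smallest RTT
    # is the moment a majority (counting the leader) has acknowledged.
    return sorted(rtt[leader].values())[majority - 1]


def commit_latency(client_site, leader, rtt):
    """Simplified model: client-to-leader round trip plus leader-to-majority round trip."""
    return rtt[client_site][leader] + majority_round_trip(leader, rtt)


def best_leader(workload, rtt):
    """Return the site that minimizes request-weighted average commit latency.

    workload maps each client site to its (assumed) share of requests.
    """
    total = sum(workload.values())

    def average_latency(leader):
        return sum(
            weight * commit_latency(site, leader, rtt)
            for site, weight in workload.items()
        ) / total

    return min(rtt, key=average_latency)


if __name__ == "__main__":
    balanced = {"us": 1, "eu": 1, "asia": 1}
    skewed = {"us": 1, "eu": 1, "asia": 8}
    print(best_leader(balanced, RTT))
    print(best_leader(skewed, RTT))
```

Under these assumed numbers, the best single leader shifts from one site to another when the workload becomes skewed toward Asian clients, which illustrates why a fixed leader placement can cost extra round-trips and why the suitable leader set depends on the replica configuration and workload.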

DOI
Type:
Conference
City:
Portland
Date:
2015-06-15
Department:
Digital Security
Eurecom Ref:
4639
Copyright:
© ACM, 2015. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in DCC 2015, 3rd Workshop on Distributed Cloud Computing, co-located with ACM SIGMETRICS 2015, June 15th, Portland, USA http://dx.doi.org/10.1145/2847220.2847222

PERMALINK : https://www.eurecom.fr/publication/4639