
On Using Raft Over Networks: Improving Leader Election


Abstract:

Raft is a state-of-the-art consensus algorithm for state replication over a distributed system of nodes. In Raft, all state updates occurring anywhere in the system are forwarded to the leader, which is elected among the system nodes to collect these updates and replicate them to all other nodes. Thus, the time required for state replication, called the system response time, depends on the delays between the leader and all other nodes. After multiple node failures and leadership transitions, each node becomes leader with some probability, and these probabilities determine the expected response time. The leadership probabilities, in turn, are affected by the random intervals that nodes wait after detecting a leader failure and before competing for the next leadership. The Raft designers suggest that the ranges of these intervals be equal for all nodes. However, this may result in an increased expected response time. In this paper, mathematical models are presented for estimating the ranges that result in the desired leadership probabilities. The theoretical results are also confirmed by testbed experimentation with a widely used open-source Raft implementation.
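
The link between timeout ranges, leadership probabilities and expected response time can be illustrated with a small Monte Carlo sketch. It is not the paper's model. It assumes that each node draws its election timeout uniformly from its own range and that the node whose timer fires first always wins the election, which ignores network delays and split votes. The function names, the timeout ranges and the per-leader response times are hypothetical values chosen for illustration.

```python
import random


def estimate_leadership_probabilities(timeout_ranges, trials=50_000, seed=0):
    """Estimate how often each node's election timer fires first.

    timeout_ranges: one (low, high) pair per node, in milliseconds.
    Simplifying assumption: the node with the shortest timeout wins.
    """
    rng = random.Random(seed)
    wins = [0] * len(timeout_ranges)
    for _ in range(trials):
        draws = [rng.uniform(low, high) for low, high in timeout_ranges]
        wins[draws.index(min(draws))] += 1
    return [w / trials for w in wins]


def expected_response_time(probabilities, response_times):
    """Weight each node's response time as leader by its leadership probability."""
    return sum(p * t for p, t in zip(probabilities, response_times))


# Hypothetical response times (ms) when node 0, 1 or 2 is the leader.
response_times = [10.0, 40.0, 40.0]

# Equal ranges for all nodes versus a narrower range for the well-placed node 0.
equal_ranges = [(150, 300), (150, 300), (150, 300)]
skewed_ranges = [(150, 200), (150, 300), (150, 300)]

equal_probs = estimate_leadership_probabilities(equal_ranges)
skewed_probs = estimate_leadership_probabilities(skewed_ranges)

print("equal ranges :", equal_probs, expected_response_time(equal_probs, response_times))
print("skewed ranges:", skewed_probs, expected_response_time(skewed_probs, response_times))
```

Under these assumptions, equal ranges make every node roughly equally likely to lead, while narrowing the range of the node with the lowest response time raises its leadership probability and lowers the expected response time.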
Published in: IEEE Transactions on Network and Service Management (Volume: 19, Issue: 2, June 2022)
Page(s): 1129 - 1141
Date of Publication: 31 January 2022


I. Introduction

Distributed systems receive extensive attention now that networking technologies are flourishing and time-critical system functions are spread over multiple interconnected nodes. The emerging Software Defined Networking (SDN) paradigm excels in assisting distributed systems and is in turn assisted by them, since distributed SDN controller clusters are more efficient than single-instance controllers [2]. The most popular open-source SDN controllers, such as OpenDaylight (ODL) [3] and ONOS [4], are fundamentally designed to support clustering. Similarly, Necklace [5] provides an architecture for distributed Service Function Chaining that performs surprisingly well. Finally, Kubernetes [6], OpenStack [7] and Hyperledger Fabric [8] are a few examples of widely used systems that owe their increased scalability and efficiency to distributed operation, which is supported by etcd [9], a distributed key-value store. However, these systems require a protocol for reaching consensus between their nodes.

References

[1] K. Choumas and T. Korakis, "When raft meets SDN: How to elect a leader over a network", Proc. IEEE NetSoft, pp. 140-141, 2020.
[2] F. Bannour, S. Souihi and A. Mellouk, "Distributed SDN control: Survey, taxonomy, and challenges", IEEE Commun. Surveys Tuts., vol. 20, no. 1, pp. 333-354, 1st Quart. 2018.
[3] J. Medved, R. Varga, A. Tkacik and K. Gray, "OpenDaylight: Towards a model-driven SDN controller architecture", Proc. IEEE WoWMoM, pp. 1-6, 2014.
[4] P. Berde et al., "ONOS: Towards an open distributed SDN OS", Proc. HotSDN, pp. 1-6, 2014.
[5] F. Esposito et al., "Necklace: An architecture for distributed and robust service function chains with guarantees", IEEE Trans. Netw. Service Manag., vol. 18, no. 1, pp. 152-166, Mar. 2021.
[6] B. Burns, B. Grant, D. Oppenheimer, E. Brewer and J. Wilkes, "Borg, Omega, and Kubernetes: Lessons learned from three container-management systems over a decade", ACM Queue, vol. 14, no. 1, pp. 70-93, 2016.
[7] O. Sefraoui, M. Aissaoui and M. Eleuldj, "OpenStack: Toward an open-source solution for cloud computing", Int. J. Comput. Appl., vol. 55, no. 3, pp. 38-42, 2012.
[8] Hyperledger Fabric: Distributed Ledger Software, Feb. 2022, [online] Available: https://www.hyperledger.org/use/fabric.
[9] etcd: Distributed Reliable Key-Value Store for the Most Critical Data of a Distributed System, Feb. 2022, [online] Available: https://etcd.io/.
[10] D. Ongaro and J. Ousterhout, "In search of an understandable consensus algorithm", Proc. USENIX ATC, pp. 305-320, 2014.
[11] The Raft Consensus Algorithm, Feb. 2022, [online] Available: https://raft.github.io/.
[12] H. Howard, M. Schwarzkopf, A. Madhavapeddy and J. Crowcroft, "Raft refloated: Do we have consensus?", ACM SIGOPS Oper. Syst. Rev., vol. 49, no. 1, pp. 12-21, 2015.
[13] K. Choumas, D. Giatsios, P. Flegkas and T. Korakis, "The SDN control plane challenge for minimum control traffic: Distributed or centralized?", Proc. IEEE CCNC, pp. 1-7, 2019.
[14] M. Karatisoglou, K. Choumas and T. Korakis, "Controller placement for minimum control traffic in OpenDaylight clustering", Proc. IEEE WF-5G, pp. 353-358, 2019.
[15] Y. Zhang, B. Han, Z.-L. Zhang and V. Gopalakrishnan, "Network-assisted raft consensus algorithm", Proc. SIGCOMM Posters Demos, pp. 94-96, 2017.
[16] E. Sakic and W. Kellerer, "Response time and availability study of RAFT consensus in distributed SDN control plane", IEEE Trans. Netw. Service Manag., vol. 15, no. 1, pp. 304-318, Mar. 2018.
[17] H. I. Kobo, A. M. Abu-Mahfouz and G. P. Hancke, "Efficient controller placement and reelection mechanism in distributed control system for software defined wireless sensor networks", Trans. Emerg. Telecommun. Technol., vol. 30, no. 6, pp. e3588, 2019.
[18] R. Hanmer, L. Jagadeesan, V. Mendiratta and H. Zhang, "Friend or foe: Strong consistency vs. overload in high-availability distributed systems and SDN", Proc. ISSREW, pp. 59-64, 2018.
[19] C. Fluri, D. Melnyk and R. Wattenhofer, "Improving raft when there are failures", Proc. LADC, pp. 167-170, 2018.
[20] D. Huang, X. Ma and S. Zhang, "Performance analysis of the raft consensus algorithm for private blockchains", IEEE Trans. Syst. Man Cybern. Syst., vol. 50, no. 1, pp. 172-181, Jan. 2020.
[21] Network Implementation Testbed using Open Source platforms (NITOS), Feb. 2022, [online] Available: https://nitlab.inf.uth.gr/NITlab/nitos.
[22] Postman: API Platform, Feb. 2022, [online] Available: https://www.postman.com/.
