Introduction
Since “an electronic cash currency completely realized through peer-to-peer technology” (i.e., Bitcoin) [1] was proposed in 2008, blockchain technology has received increasing attention. The essence of blockchain is a decentralized distributed ledger database [2], which integrates peer-to-peer (P2P) networks, consensus algorithms, cryptology principles [3], smart contracts and other technologies. Due to its security, decentralization, tamper-proof, and traceability, blockchain technology is widely used in digital currency, finance, health care, energy, IoT, and other fields [4].
The consensus algorithm, as the core technology of blockchain, especially permissioned blockchain, enables distrustful participants to verify the validity of data in the blockchain system in a decentralized manner, ensuring that each node of the blockchain can record and store data according to the same uniform standard of behavior, thus reaching a consensus of the whole network. According to the degree of decentralization and application scenarios, blockchain can be divided into three main categories: private chain, public chain, and consortium chain [5]. Private chains are blockchain systems that are open to individuals or organizations. The most representative consensus algorithms of private chains are Paxos [6], Raft [7], etc. These consensus algorithms generally only consider failures due to node and network problems (such as node downtime, network failure, etc.) and do not consider the malicious nodes in the cluster. Therefore, the Byzantine fault tolerance problem cannot be solved. Compared to other chains, public chains are the most centralized chains, such as Bitcoin [1] and Ethereum [8]. The most famous consensus algorithm of the public chain is proof-of-work (PoW), which is completely decentralized and open to all nodes that can access the internet. Although PoW can effectively resist Sybil attacks and provide security for the system when the computing power of malicious nodes does not exceed 50%, its mechanism of relying on the computing power of computers to complete consensus brings the problem of wasted resources, its transaction confirmation latency is high, and throughput is low. Compared with the PoW algorithm, the proof-of-stake (PoS) [9] algorithm introduces coin days to dynamically adjust the mining difficulty to reduce computing power consumption. As a result, the PoS algorithm mines faster and consumes fewer resources. Delegated proof of stake (DPoS) [10] uses a scheme based on node weights to elect proxy nodes to cancel mining. As the number of nodes participating in the consensus is greatly reduced, the time to generate blocks is shortened, and consensus efficiency is improved. However, it may cause some nodes with more resources to control the election process, which can easily destroy the network.
To better protect user privacy and supervise data, consortium chains are proposed. In the application scenario of the consortium blockchain, the Byzantine fault tolerance (BFT) algorithm is the most commonly used consensus algorithm, which was proposed by Lamport [11] in 1982, creating a precedent for the Byzantine Fault Tolerance algorithm. Due to the high complexity of the BFT algorithm, it has yet to be practically applied. Castro et al. [12] proposed a practical Byzantine fault tolerance (PBFT) algorithm in 1999, which improved the BFT algorithm’s inefficiency and achieved fault tolerance for a certain number of Byzantine nodes to provide security and activity in asynchronously distributed systems. Additionally, the algorithm complexity is reduced from an exponential level to a polynomial level. Compared with other consensus algorithms, the PBFT algorithm does not need to consume many computing resources and has the characteristics of high throughput and low latency, so it is widely used in permissioned chain systems.
At present, the evaluation indicators of consensus algorithms are different, and different scenarios have different focuses, including fairness and decentralization degree from the perspective of sociology, consensus energy consumption and economic cost from the perspective of economics, and scalability, fault tolerance, and security from the perspective of computer science [13]. How to adaptively design a consensus mechanism for a specific performance evaluation target in combination with specific requirements and application scenarios to optimizethe algorithm [14], [15] is an urgent problem to be solved. With the gradual expansion of the scale of the application scenario of the permission chain, the scalability problem of the PBFT algorithm has become increasingly prominent. Therefore, to reduce the communication complexity of the PBFT algorithm and improve the applicability of the permission chain in large-scale application scenarios, this paper proposes a scalable Byzantine fault-tolerant algorithm based on tree topology networks (STBFT). Specifically, the main contributions of this paper are as follows:
First, we change the mesh network topology of the PBFT algorithm to a tree network topology so that the STBFT algorithm is suitable for hierarchical network systems. The STBFT algorithm divides the network nodes into several subnets, and each subnet has a primary node to send the consensus result to the client.
Second, for the hierarchical network system, the higher the Byzantine node level is, the greater the threat to the consensus network. To make the Byzantine node unable to predict the grouping information, a random grouping strategy based on a verifiable random function (VRF) is proposed to prevent Byzantine node targeted attacks and collusion from affecting the normal consensus of the system, thus ensuring the security of the system.
Third, scalability and fault tolerance cannot be well balanced in hierarchical network systems. Therefore, the feedback mechanism is proposed to monitor and relay the nodes’ behavior, reduce Byzantine failure’s influence on hierarchical network systems, improve the fault tolerance performance of algorithms, and enhance the security and activity of the system.
Related Work
The most significant difference between the blockchain and the traditional distributed consensus system is its complex operating environment. Even in the strict consortium chain, Byzantine nodes will be introduced. Therefore, since the birth of blockchain technology, Byzantine fault-tolerant consensus algorithms have gradually increased. Given the deficiencies of the PBFT algorithm, in recent years, researchers have mainly researched the process optimization of the PBFT algorithm and the consensus system architecture, resulting in the birth of many excellent Byzantine fault tolerance consensus algorithms.
For example, to improve the performance of the consortium chain, the tPBFT algorithm [16] introduces a trust equity scoring mechanism. Consensus nodes can be dynamically adjusted and simplify the process of consensus, and the efficiency of consensus and node scalability outperforms the PBFT algorithm. Yin et al. [17] proposed HotStuff, which changed the network topology of PBFT to star network topology, adopted threshold signature technology, collected signatures by the primary node, instead of using the many-to-many broadcast communication mode, and used parallel pipelining to process transactions. The HotStuff algorithm significantly improves the efficiency of consensus, but the primary node bears a greater communication burden, which will quickly lead to network congestion. The DPNPBFT algorithm [18] selects dual primary nodes to avoid excessive centralization of a single primary node. The two primary nodes check balance and supervise each other. Therefore, the DPNPBFT algorithm has good system security. Wan et al. [19] proposed the AnonymousFox consensus algorithm, designed an anonymous leader node sorting algorithm to hide the leader node identity, and reduced node communication overhead through one-to-many message communication. The AnonymousFox algorithm improves the algorithm’s throughput and guarantees the system’s security. In addition, the number of nodes participating in the consensus directly impacts the communication complexity of the PBFT consensus algorithm, so researchers try to reduce the number of nodes participating in the consensus. A series of algorithms based on the “committee + PBFT” model have been proposed [20], [21], [22], [23], [24], [25], which selects a subset of the nodes as the committee from all nodes in the system based on their credit, tokens, etc. The committee node participates in the consensus protocol on behalf of all nodes, and other nodes conduct verification and synchronization of the entire network, realizing the rapid consensus of the network. To prevent the selected nodes from being too concentrated to threaten the system’s security, the Algorand algorithm [26] uses VRF [27] to randomly select a subset of network nodes as a committee to ensure the unpredictability of committee members. In the consensus algorithm of the “committee + PBFT” model, the performance and scalability of the algorithm are improved due to the reduction of the consensus network scale. However, the functions of the committee node are higher than those of the general node, and the power is too concentrated in the committee node, which violates the decentralized nature of the blockchain. The above algorithms significantly improve the consensus efficiency, but malicious nodes remain in the consensus system, and system security will not be guaranteed. Therefore, Hao et al. [28] and Jiang and Lian [29] verified the identity information of nodes joining the system by introducing centralized authentication nodes and designing related protocols for node joining, exiting, and other operations. These algorithms not only inherit the advantages of the PBFT algorithm but also have better dynamic properties and higher robustness. Kim et al. [30] considered that the “committee + PBFT” model has a limit that a malicious cartel with more than 1/3 of the total stack can launch persistent center attacks, which disrupt the consensus process. Therefore, this problem is solved by introducing the concept of main-validators and sub-validator and proposing a behavior-based credit-scoring function.
The biggest problem with the PBFT algorithm is its high communication complexity and poor scalability. However, optimizing the PBFT algorithm process alone is not enough to solve application problems in large-scale scenarios. A series of new consensus frameworks have been proposed, whether horizontal sharding or vertical hierarchical expansion is adopted, which essentially divides nodes into independent spaces. Elastico [31] is the first Byzantine fault-tolerant algorithm based on sharding technology. It selects committee nodes in the network through PoW algorithm competition and assigns the elected committee nodes to different groups. Each group executes the PBFT algorithm to process different transaction sets in parallel, and the sharding technology improves system performance through parallel processing. The acceptable number of malicious users using the BFT algorithm is limited, if users collude with malicious users and exceed the limit, consensus cannot be reached. Jeon et al. [32] proposed a randomized mesh blockchain using the diversity of opinion Byzantine fault tolerance to solve this problem. Some efficient network partitioning methods have been proposed to address the specific requirements of consensus algorithms in certain blockchain application scenarios and achieve better results in practical applications. Lin et al. [33] proposed a grouping strategy based on link identification with a supervision mechanism that can better adapt to the consortium chain supervision system. Yoo et al. [34] proposed location-based sharding, which divides nodes into many groups based on location. Consensus nodes are clustered by an improved K-medoids clustering algorithm [35], and the nodes with higher trust are selected as clustering centers. However, only from the perspective of plane architecture can the status quo no longer be met, and an increasing number of algorithms have been researched on hierarchical structures. Feng et al. [36] proposed a scalable dynamic multi-agent hierarchical SDMA-PBFA algorithm, which divides nodes into several groups. Each group elects an agent and executes the PBFT algorithm in each group. Finally, all agents send the consensus result to the client. Li et al. [37] proposed SHBFT, which optimized many aspects based on PBFT, designed a hierarchical structure to speed up the consensus process and achieved global consensus through local consensus. Li et al. [38] proposed a scalable multi-layer consensus mechanism based on the PBFT algorithm, which significantly reduces communication complexity and provides guidance for the design and performance analysis of hierarchical systems.
It can be seen from the above research that these algorithms optimize the PBFT algorithm from different angles. Although many problems in the PBFT algorithm can be solved, in the case of many nodes, only optimizing the PBFT algorithm process is far from sufficient. Therefore, a series of consensus algorithms start from the system architecture and divide the nodes into local consensus through a group-based or hierarchical structure, which significantly reduces the communication complexity and improves the system scalability, fundamentally solving the problem that the PBFT algorithm cannot be applied to a large-scale network environment. However, the consensus algorithm of a group-based or hierarchical structure has the following shortcomings. First, the existing research method of network node division is determined by the system model and has not been authenticated by the nodes of the whole network. Second, the current node division rarely considers the collusion between malicious Byzantine nodes, so a considerable threat to the system’s security exists. Third, scalability and fault tolerance are relatively large contradictions in the hierarchical system. Although the hierarchical system can solve the scalability problem of the PBFT algorithm, the system also has great instability and is easily affected by Byzantine nodes.
To this end, we conduct in-depth research from the system model of the PBFT algorithm. To solve the application problem of the PBFT algorithm in specific scenarios in the consortium chain, we propose the STBFT algorithm, which is also a hierarchical structure based on tree topology network. However, our algorithm is different from other algorithms in several ways.
First, network nodes are randomly and secretly assigned to different layers and groups, and the grouping information of each node is verified and confirmed by other nodes.
Second, since the tree topology is a unique network structure, we design the system model in a targeted manner. The nodes of each layer will collect corresponding evidence from supervising and providing feedback to the nodes of the upper and lower layers.
Finally, we have solved how to activate the system to ensure the consensus consistency of the algorithm when the nodes are Byzantine nodes.
Therefore, our algorithm can not only improve the system’s scalability but also deal with scenarios that require high fault tolerance, making the blockchain consensus algorithm more applicable.
Styling STBFT Algorithm Design
In this section, we elaborate on the design of the STBFT algorithm in detail. We first describe the system model of the STBFT algorithm. Then, we overview the process of the consensus algorithm. Finally, we present random grouping strategy and feedback mechanism to minimize the impact of malicious nodes on the system.
A. System Model
The nodes in the tree topology network are arranged in a tree shape, which resembles an inverted tree, hence the name. The tree topology is easy to expand, so new branches or nodes can be added to the network. It is easy to isolate the fault site from the whole system; for example, if one branch node fails, it mainly affects the local area. The tree topology network structure is a hierarchical structure that is suitable for a hierarchical network system. Therefore, we propose a multi-centralized, “group-based and hierarchical” STBFT algorithm based on a tree topology network. We divide the nodes in the network into several layers. Each layer is composed of different areas, and the whole network consensus is divided into several subnetwork local consensuses. Each parent node (nonleaf node) forms a subnetwork with its child nodes, which we call the consensus domain. The network composed of child nodes is called the group domain (group). Figure 1 briefly shows the system model of the STBFT algorithm.
B. STBFT Algorithm Consensus Process
According to the hierarchical structure of the tree topology network, the STBFT algorithm improves the consensus protocol of the PBFT algorithm. First, the primary node in the consensus domain no longer participates in the consensus of its child nodes (slave nodes), and the primary node of each layer is only responsible for sending the request message to the child nodes and collecting the voting results of its child nodes. Second, only the node that has generated the pre-prepare message of certificate
Figure 2 shows the consensus process of the two-layer STBFT algorithm. We assume that the number of consensus nodes in each group domain is
The root node of the system forms a consensus domain with the nodes in the group domain
When a node receives more than
C. Random Grouping Strategy Based on VRF
For hierarchical systems, the higher the level of nodes, the greater the impact on the system. If Byzantine nodes collude maliciously and are at a high level of the system, the system’s security will be threatened. Therefore, to make each node randomly distributed in the system, the grouping information of nodes can be verified and confirmed by other nodes. In the grouping stage, the nodes can be divided into provers and verifiers, and the grouping information of each prover needs to be confirmed by the verifier. Therefore, the random grouping process is divided into two parts: the group formation stage and the group verification stage.
1) GST CENTER
The group status table (GST) is used to store the grouping status and information of each node, and some key information of GST is shown in Table 1. At the same time, the GST Center needs to be introduced to maintain the GST. The GST Center is a security service organization that all alliances and institutions recognize and rarely exhibit error behavior. The nodes fully trust all the information recorded by the GST Center. However, introducing the central node in the distributed network environment is not a good solution. In this paper, the GST Center is only used for information storage and does not participate in any grouping strategy. The introduction of the GST Center provides the following functions:
Provide a random seed to the prover and determine the grouping of nodes according to the random number
generated by the prover. Store the time each group formed.v Store the prover’s proof,
, group information, and timestampv for each group formed, and record basic information such as the ip address and pk.t When the identities of most nodes are exposed, a few nodes can be attacked. Therefore, we first do not publish the grouping information of each node, but rather the grouping information of the nodes is saved in the GST Center.
2) The Group Formation Stage
Assuming that there are
Provide a random seed to the prover and determine the grouping of nodes according to the random number
generated by the prover. Store the time each group formed. All consensus nodes in the blockchain network are numbered, and {1, 2, 3...v } is used to represent the node’s ID. The prover participating in the lottery is randomly selected from all nodes. The proverp obtains the random numberp and proof according to the VRF, and the generation process is as follows:v where seed is provided by the GST Center, num is the counter, the initial value is 0, and sk is the node’s private key.\begin{equation*} {VRF}[({seed},{num}),{sk}]\to {v}+{proof} \tag{1}\end{equation*} View Source\begin{equation*} {VRF}[({seed},{num}),{sk}]\to {v}+{proof} \tag{1}\end{equation*}
The prover
sends information such asp and proof to the GST Center. After verification, the GST Center will assign the prover to the corresponding group according to the random number valuev . When a group member reaches the threshold, the timestampv of the threshold is recorded. The position of each group in the STBFT algorithm is determined according to the timestamp. The purpose is to allow each group to be randomly assigned to the hierarchical system according to the timestamp.t When a group member has reached the threshold, if there is still a prover
assigned to this group, the proverp_{1} will not be added. The GST Center will provide a random seed for proverp_{1} again, and the value of num will be increased by one. Continue to calculate random numberp_{1} by seed and num until randomly assigned to the appropriate group.v
3) Group Verification Stage
When all nodes are randomly assigned to different groups at different levels, the grouping information of each node will be confirmed by all verifiers. The group verification stage is shown in Figure 3.
Step 1:
GST Center multicasts < broadcast,
(gst),m message to all verifiers including the prover, whered>_{\& g} (gst) contains random numberm , proof, timestampv , and group information,t is the digest ofd (gst), and >he subscript is the signature of the GST Center.m Step 2:
After receiving the broadcast message from the GST Center, the prover verifies whether it is valid and confirms its grouping information. At the same time, when all verifiers receive the broadcast message from the GST Center, they first verify the information of the verifier according to seed, v, proof, and pk in
(gst) and then confirm the grouping information of the verifier according to the group information, timestampm , and other information provided by the GST Center. If the broadcast message passes the verification, they will sign with pk and send the < response,t message to the prover.d>_{\&i} Step 3:
If the prover receives more than
matching response messages, it multicasts the < commit,2f message to all verifiers.d>_{\& s-list} represents the maximum number of Byzantine nodes that can be tolerated in the total network, andf , &n\ge 3f+{ }1 represents the set of signatures of all verifiers._{s-list} Step 4:
After receiving the commit message, all verifiers confirm whether the verifier has received more than
valid signatures. If the verification passes, it means that the prover has completed the grouping, all verifiers will send < reply2f messages to the GST Center and prover. The GST Center and the prover receiving more than>_{\& i} matching reply messages indicates that the whole system has agreed to the prover’s grouping information.2f
D. Feedback Mechanism
The basic idea of the STBFT algorithm is to reduce the communication complexity through the internal communication of each group and effectively improve the consensus efficiency. In Section B of III, we know that in the normal consensus process, if a node has reached consensus in the group domain, it will generate corresponding certificates and reply messages, send reply messages to high-level nodes, and send certificates to low-level nodes, these are important support for reaching consensus. However, the hierarchical structure based on the tree topology network will also face some problems. In the case where the high-level nodes are Byzantine nodes,
the lower-level nodes of the entire system cannot receive any correct messages, and frequent view change is needed; however, the cost of view change is expensive. Therefore, for the STBFT algorithm proposed in this paper, a feedback mechanism is designed to supervise and relay the behavior of nodes so that the system can minimize the influence of malicious nodes on the system when the nodes have Byzantine behavior, thereby improving the fault tolerance performance.
First, we define the feedback domain, which is a special autonomous region. As shown in Figure 4, node
Case
Under normal circumstances, each node in the group domain
(such as sending inconsistent messages or not sending messages within a specified time), at the consensus stage,
In addition, in the group domain
Case
When each node in group domain
Assume that node
Case
Since two consecutive nodes in the same branch are Byzantine nodes, in this case, the child nodes of node
Case
In this case, the feedback mechanism is ineffective in the feedback domain, so all branch nodes with node
Analysis and Simulation Experiments
A. Analysis of Failure Probability P of the Feedback Mechanism
In Case \begin{equation*} P(A)=C_{f_{1}}^{1} / C_{m}^{1} \tag{2}\end{equation*}
\begin{equation*} {P}({B / A})=\sum \limits _{c=f_{g} +1}^{k} {C_{f_{1} -1}^{c}} {/ C}_{m-1}^{k} \tag{3}\end{equation*}
\begin{equation*} {P=P}({A}) \times {P}({B/A})=({C}_{f_{1}}^{1} {/ C}_{m}^{1}) \cdot \left({\sum \limits _{c=f_{g} +1}^{k} {C_{f_{1} -1}^{c}} {/ C}_{m-1}^{k} }\right) \tag{4}\end{equation*}
B. Communication Complexity Analysis of the STBFT Algorithm
Assuming the number of nodes of the PBFT algorithm is \begin{equation*} {C}1=2{n}^{2}-{n +}1 \tag{5}\end{equation*}
prepare stage, commit stage, reply stage, and final agree stage. The communication complexity of each consensus domain is \begin{equation*} {C}2=[{n}(2{k}^{2}+1)] {/ k} \tag{6}\end{equation*}
\begin{equation*} {R}_{c} =[{n}(2{k}^{2}+1)]/[{k}(2{n}^{2}-{n}+1)] \tag{7}\end{equation*}
C. Fault Tolerance Analysis for the STBFT Algorithm
To improve the fault tolerance and security of the system, we use the randomness of VRF to ensure that nodes are evenly distributed in the consensus network. At the same time, our algorithm uses a feedback mechanism to deal with malicious behavior of Byzantine nodes, which can maximize the success rate of our algorithm in achieving consensus. Part A of Section IV shows that in the case of a random distribution of nodes, the probability that the feedback mechanism cannot be triggered within the acceptable fault tolerance range is extremely low. Therefore, we verify the effectiveness of the mechanism and analyze the success rate under malicious node attacks. By introducing the faulty probability determined (FPD) and faulty number determined (FND) models [38], our algorithm is compared with the Multi-Layer PBFT algorithm [38] to verify that it can improve the fault tolerance of the hierarchical system. The FPD model is used when the probability of every single faulty node is fixed, and the FND model is used when the number of faulty nodes in the system is fixed. We conduct FPD and FND simulation experiments on the two algorithms in MATLAB. Assuming that all nodes are independent of each other in the experiment, we use the same two-layer system structure for both algorithms and take the number of nodes
Analytical results of the FPD and FND models with different k and n values for the STBFT and Multi-Layer PBFT.
To further verify that our algorithm can significantly improve the system’s fault tolerance, we also conduct an experimental analysis of the two algorithms in the same three-layer system structure. The number of nodes
Conclusion
Due to problems such as high communication overhead and poor node scalability of the PBFT algorithm in consortium chains, this paper proposes a scalable algorithm based on a tree topology network structure that divides network nodes into groups and layers and changes from global consensus to hierarchical multi-centralized consensus, which reduces communication complexity and improves node scalability. At the same time, our algorithm considers the problems of low fault tolerance and weak security in the hierarchical system. Therefore, in response to the problems mentioned above, the STBFT algorithm ensures that the allocation of each node is random, preventing malicious nodes from colluding with each other. All nodes will verify and confirm the location information in the system. In addition, the primary nodes in each consensus domain do not participate in the consensus of the group domain but only send transactions and collect consensus results, reducing the communication burden of the primary nodes. We also specifically designed the system structure of the algorithm, which can perform mutual supervision between the upper and lower layers and provide timely feedback when nodes generate malicious behavior, allowing the system to take measures when nodes have Byzantine behavior to improve fault tolerance. Theoretical analysis and simulation results show that the STBFT algorithm significantly reduces the communication overhead, has higher fault-tolerant performance than the Multi-Layer PBFT algorithm, and improves the system’s security. However, when the number of faulty nodes in the system is fixed, the PBFT algorithm can tolerate one-third of the nodes, while the STBFT algorithm is far from fault tolerant, and many application scenarios not only require scalability but also have high requirements for delay. Therefore, future research includes solving the delay and fault tolerance problem in the hierarchical system so that the consortium chain can be applied to more severe and complex environments.