Introduction
Since the birth to Bitcoin [1], blockchain technology has attracted great attention on both academic and industry for its trustworthy, tamper-proof, traceable, and decentralized. People began to explore the application mode of blockchain. As an example, Ethereum [2] even can run apps written in Turing-complete languages. Blockchain is a P2P distributed ledger technology, which maintains a global ledger through each node. However, to prevent malicious tampering with accounts, data redundancy has to be infinitely expanded. As a result, the storage cost of nodes is rising. Taking Bitcoin as an example, the size of a block data is about 1MB, and a new block is created every 10 min. Each individual node increases storage costs nearly 52GB per year.
In addition, fast transactions are a general trend to increase system throughput and many recent schemes have been proposed [3]–[5]. When the transaction throughput of a blockchain system increases to the level of VISA, the blockchain system needs to increase the storage cost of 214PB each year [6]. Therefore, with the development of the blockchain, the storage cost will become a problem that cannot be ignored.
Pruning data and sharding storage are popular candidates to reduce data redundancy in blockchains. Pruning data may lose the traceability of the blockchain. A blockchain system without traceability will lose many application scenarios, such as auditing, product traceability [7]–[9], and so on. Sharding storage is to disperse store data in different storage sharding to reduce redundancy. The sharding blockchain system modifies the protocol of the traditional blockchain. Therefore, different committees need to constantly interact to verify new transactions. It can reduce redundancy, but changes the underlying storage and verification modes. Modifying the original protocol of an existing blockchain system requires community consensus, which requires a lot of work and risks ecological damage. Therefore, this paper tries to solve the storage redundancy problem without modifying the existing blockchain system.
Here, we propose GCBlock: a scalable and tolerant disperse storage scheme by leveraging grouping and a new created network coding architecture. The grouping idea is somewhat similar to the existing sharding system [4], but the application model and structure are different. First, grouping is performed on the overlay network layer and will not change the existing underlying architecture of the blockchain. Our approach is only to provide an option to reduce storage cost in existing blockchain systems. It differs from the sharding blockchain system which changes the protocols and architecture of the most existing system. Then, in terms of disperse data, the sharding blockchain system needs to store data instantly, while we only disperse the data after the transaction consensus is reached. Because the data after the consensus is not directly used to verify new transactions, it will not break the underlying running mechanism. For example, in Bitcoin, new transactions are verified by a copy of UTXO [10]. In addition, GCBlock is not a blockchain protocol, but a scheme to reduce storage for existing blockchain systems. The differences between GCBlock and blockchain protocols are shown in Table 1.
To reduce the communication delay between nodes within a group, we group based on fuzzy distance. To further enhance stability and reduce redundancy, we construct the transcript fractional repetition (TFR) code, which is a new code scheme using erasure code within a group. For the blockchain system is a complete P2P network and the join and departure of nodes is uncertain, traditional erasure code schemes cannot handle this instability very well. TFR code is based on fractional repetition (FR) code [11] to add layers, which will enhance the fault tolerance of the system. In addition, because the nodes in the blockchain network are not always benign, our distributed storage scheme may also suffer attacks from malicious nodes. To address these challenges, we introduce a way to independently detect malicious and selfish behaviors within the group. Malicious and selfish peers can be removed from the group by a node’s autonomous check. Specifically, our contributions can be summarized as follows:
In this paper, we present a scheme to reduce the storage cost for the existing blockchain system by leveraging grouping overlay network and a new network code architecture.
To shorten the communication delay when tracing, the grouping criteria is based on fuzzy distance. The grouping is over overlay network which can keep the original protocols of blockchain systems.
We construct a new coding scheme - TFR code which aims to adapt the scalability of the number of nodes in the group. To further enhance system stability, we propose countermeasures to identify evil behaviors and nodes within the group.
We evaluate our scheme comprehensively. The results show that the communication overhead and system latency are reasonable.
Related Work
Recently, some researchers have focused on pruning old data and designing new system architectures to reduce blockchain data redundancy. In the Bitcoin white paper [1], Nakamoto has proposed “restoring disk space” and “simplifying payment verification (SPV)”. SPV nodes need to rely on other nodes to verify transactions, which reduces security. Restoring disk space is to prune old transaction data. Similarly, in sharding blockchain system, E. Kokoris-Kogias et al. [5] also chose the way of pruning old data to reduce data redundancy. However, with the pruning of transaction data, the blockchain also loses the properties of data traceability. R. K. Raman et al. [12] designed a new blockchain that uses Shamir secret sharing coding schemes to allow each node to store only a portion of the data to reduce storage costs. M. Zamani et al. [4] designed a new sharing blockchain protocol and proposed a sharding storage scheme that allows each committee to store a portion of the data to reduce data redundancy. Our scheme is different from the above. We do not change the original blockchain system, but only provide a solution to solve data redundancy for the existing blockchain system. We propose to add an overlay layer for the existing blockchain network, allowing the nodes to choose whether to join the grouping overlay network. Nodes joining a grouping rely on data encoding to reduce data redundancy.
Z. Xu et al. [13] proposed a scheme to reduce data redundancy in the private/permissioned blockchain by assigning blocks in a Consensus Unit. In their scheme, the Consensus Unit comprises a group of nodes which work together and just need to maintain at least an entire copy of Blockchain data. This may be impractical in public blockchain for nodes in public blockchain are independent of each other and the assignments may be unequal for both storage costs and query costs. Besides, they should estimate the communication costs and the frequently access set periodically and adjust to reassign the blocks which may be unstable and can be affected by multiple factors. In this paper, we focus on the redundancy problem in the public blockchain rather than the private/permissioned blockchain. We use an erasure code scheme to reduce node redundancy instead of the replication scheme. It has been demonstrated [14] that erasure coding has greater advantages in terms of fault tolerance, lower bandwidth and lower storage.
Preliminaries
In this section, we define the basic structure model, the threat model and on that basis present our design goals.
A. Structure Model
To show our scheme more clearly, we draw the original blockchain network model and grouping overlay network model as shown in Figure 1.
Original Blockchain Network: A common blockchain network, such as Ethereum, Bitcoin, etc.
Full ledgers: The global ledger data of the blockchain includes the data of reached consensus and the data of waiting for consensus. The full ledgers are stored synchronously in each node.
All nodes: All nodes that store the full ledgers (e.g., the full node of Bitcoin). All nodes can choose whether to join the grouping overlay network.
Grouping Overlay Network: A grouping network built on top of the original blockchain network.
Data of reached consensus: The data of reached consensus cannot be changed. (e.g., Bitcoin transactions that have been confirmed after 6 blocks are almost impossible to tamper with.) GCBlock only operates on the data after the consensus is reached.
Network coding: Erasure code is used to encode block data and store them in groups. Our coding scheme is described in Section IV.
Groupi: Groups formed by nodes that have joined the overlay network. Specific grouping schemes will be described in Section IV.
Next, we briefly describe how nodes reduce storage costs by joining the grouping overlay network. When a node joins the GCBlock, the node preferentially joins the closest group. After successfully joining the group, it accepts encoded fragments within a group and deletes data of reached consensus stored by itself. In this way, the node only stores a part of the encoded fragments, which reduces the storage costs. When data needs to be recovered, the node can download enough encoded fragments from peer nodes within the group to decode recovery.
B. Threat Model
Since the GCBlock only provides a service to reduce data redundancy of the original blockchain, we do not consider the case where the blockchain system is compromised. In addition, we also do not consider irrational large-scale attacks.
In the threat model, we mainly consider the selfish behavior of nodes and the malicious behavior of nodes. The selfish behavior of nodes is mainly free-riding. Free-riding means that the node does not store any encoded fragments and gets encoded fragments from other nodes in the group when needed. The malicious behavior of a node is that the node provides spurious encoded fragments to peers within the group. Sometimes, malicious behavior is a disguise of selfish behavior. For example, selfish nodes provide spurious encoded fragments for others. We collectively refer to selfish behavior and malicious behavior as evil behavior.
C. Design Goals
The main purpose of GCBlock is to reduce the storage costs of blockchain systems. As we all know, the blockchain network is a P2P network. Therefore, we considered the characteristics of P2P networks to design GCBlock. Our design goals are:
High tolerance: In a P2P network, nodes may quit and join at any time. Our coding scheme must be highly flexible to accommodate the constant quitting and joining of nodes while ensuring extremely low data recovery failure rates.
Security: For the evil behaviors proposed by the threat model, we need to formulate strategies to remove the evil nodes from the group.
Efficiency: Nodes within a group should be able to transfer data with low latency.
Proposed Scheme
A. Overlay Grouping Network
An overlay network is a virtual network built on top of an existing network through logical links of nodes, designed to extend services that are not available on existing networks [15]. There is no need to changes the underlying network, overlay network looks like a normal user-level application. Constructing GCBlock in the overlay layer can maintain the running mechanism and architecture of the original blockchain system. There have been many researches on the application of overlay network in P2P network, which can be divided into structured overlay network [16], [17] and unstructured overlay network [18], [19]. A structured overlay network maintains a specific structure that enables reliable routing between a limited number of nodes. Compared with the unstructured overlay network, the nodes of the structured overlay network can choose the neighbors that meet certain constraints in a set of nodes.
To group nodes based on delay is an effective approach for organizing and managing nodes that join GCBlock. The delay between nodes of the same group is less than that between nodes of different groups, to reduce network traffic and improve routing efficiency. Usually, the delay between nodes is proportional to their physical distance. Therefore, we can construct a structured overlay network based on grouping according to the underlying physical distance. In addition, the physical scope within the grouping should be reduced and the upper bound of this range should be specified to meet the requirements between nodes.
1) Beginning Grouping
The algorithm to group nodes by geographical location on overlay network is usually called landmark clustering. In the grouping algorithm, we need to select the landmark node and then optimize the topology based on the landmark node. It is an NP-hard problem to find landmark node that can minimize the total communication delay in the group. Therefore, the algorithm for beginning grouping should be able to achieve our requirements approximately and get a grouping topology. In the beginning state, GCBlock needs a bootstrap node. The bootstrap node is only for boot system startup and will stop working after system startup.
The bootstrap node asks joining nodes for routing messages, and then obtains a similar distance among the nodes according to count the IP (The IP in this paper is the IP address.) hop in route messages. Therefore, we can easily get a similar distance matrix. The network nodes grouping also becomes points grouping in the matrix. This idea of network grouping is called fuzzy clustering [20], [21] and has been used in many fields [22]–[24].
For the uncertainty of the number of groups, the hard clustering algorithm that directly specifies the number of groups is not suitable. Thus we introduce the ISODATA algorithm [25], [26] and make adaptive modifications for network grouping in Algorithm 1. To implement the algorithm, we need to set the following parameters:
–The number of centers that were selected randomly;s –The minimum distance between groups;\text{D}_{min} –The distance between groups;D_{center} –The maximum variance within the group;\sigma _{max} –The minimum number of nodes in a group;n – The variance of nodes within a group;\sigma _\lambda –The number of nodes in a group;Sum[G_{i}] –The number of iterations;v
Algorithm 1 Beginning Grouping
Nodes set
The result of initial grouping
Step 1:
Randomly selected
Step 2:
for each
computing
Add
end for
Step 3:
Recomputing the number of each group Sum
while
Get rid of the
end while
Step 4:
Updating the grouping centers
Computing the variance within the group
if
Split into two different groups
end if
Step 5:
Computing the distance between centers-
if
Merge
end if
Step 6:
if
else
end if
In order not to affect the nature of the algorithm, the delay between nodes is represented by the Euclidean distance. In the first step, the algorithm randomly selects
After outputting the grouping result, the bootstrap node will send the number of a corresponding group and the IP list of group members to nodes and complete the beginning grouping phase. For the randomness of the algorithm, the optimal result cannot be guaranteed. Some nodes that too scattered may not join a group during the beginning grouping phase. Therefore, these nodes need to wait for new nodes to join and form new groups.
Remark:
In cases where regional faults (network fault and power fault) need to be considered, our scheme can be easily extended to a simple grouping way in which nodes are randomly divided into groups. However, this will lose the efficiency of transferring data among nodes. Actually, our scheme can provide alternative node-dividing methods to be compatible with the different requirements.
2) Duty Node
For out of group communication and intragroup tasks, each group needs to select a duty node at every T interval. The duty node is selected by random voting within the group.
a: Selection Rules
Scope: Each node randomly selects a node from the 1/3 nodes that joined the group for the longest time. Method: Each selection result will be broadcast within the group, the node with the highest number of votes becomes the duty node. Therefore, the longer the joining time, the easier it is to be selected as the duty node. We also believe that the stability and reliability of the node with a long joining time are higher. Balance: When a node is selected as the duty node, its join time will be reset. This ensures a more balanced chance of becoming a duty node.
When the duty node does not send the encoded fragments within M time, the group members will remove the IP of duty node from the IP list of group members and reselect a new duty node.
b: Responsibility
During the tenure, the duty node is responsible for encoding the data and sending it to the peers within the group according to encode rules. The duty node is also responsible for checking malicious and selfish behavior and alerting peers within the group. Additionally, the duty node is also responsible for the tasks of contacting new nodes and checking nodes exit.
c: Deal With Evil Behavior
When the node finds malicious or selfish behaviors, it can remove the evil nodes from the IP list of group members and report these behaviors to the duty node. The duty node will verify the report. If an evil behavior is verified, it will broadcast the alerting signal to the peers within the group. The peers self-verify evil behaviors by the alerting signal, and remove the node from the IP list of the group members when the evil nodes are identified. All IP of evil nodes will be added to a blacklist of the group by the peers, and the IP in the blacklist will be forbidden to join the group.
d: Identify Evil Node
After encoding, the duty node hashes the encoded fragments, forms a Merkle tree structure and broadcasts it to all nodes in the group. When decoding is required, the node can verify the obtained encoded fragments according to the Merkle tree, and can quickly identify the source of the pollution. As shown in Figure 2, a schematic diagram for the generation of a Merkle hash tree with eight encoded fragments.
An eight-leaf Merkle hash tree. The leaves of a tree are hashes of encoded fragments. The Merkle hash tree is formed by recursively hashing the hashes of the leaves.
Remark:
The duty node does not have the authority to remove other nodes from the group. Removing an evil node from the IP list of the group members are nodes’ self-behavior. When most nodes within a group remove the evil node from the IP list of the group members, the evil node cannot get data from these nodes within the group. In this way, the evil node is removed from the group.
3) Joining and Exiting
The joining and exiting of nodes in a P2P network are unpredictable. For joining and exiting nodes, we give the following scheme:
New node joining: When a new node joins the network, it sends a notification message to the entire network. The duty nodes reply with the confirmation message to the new node, indicating that the new node has been known to join the overlay network. The new node estimates the distance according to the delay of the confirmation message, and then requests to join the group with the closest distance. The process of request to join a group is shown in Figure 3.
Describe the process:
The new node sends a request join message to the closest duty node.
After receiving the request message, the duty node sends a test message to the new node.
When the new node receiving the test message, it immediately sends the request test message.
After receives the request test message, the duty node estimates the distance of the new node according to the delay. The duty node sends an accept message to the new node when the distance of the new node does not exceed the required maximum distance within the group, otherwise sending a refuse message. The distance within a group is estimated by communication delay.
Node Exiting: Through the heartbeat mechanism, the duty node can detect the node exit and broadcast a message within the group. The exit node will be removed from the group and the IP of the node will be added to a limit list. The limit list is designed to prevent the intermittent offline behavior of the nodes. If the node exits, it will be restricted to rejoin for a long time.
B. Network Coding
The joining and exiting of the nodes is uncertain in the blockchain network. Therefore, our coding scheme has to meet the following two requirements:
It is able to adapt to the constant changes in the number of nodes within the group.
It has the ability to quickly repair the fault nodes and higher fault tolerance.
In faulty node repair, most erasure code systems use coding repair. Surviving nodes and new nodes need to calculate a linear combination of stored symbols for regeneration, which undoubtedly adds additional computational overhead. To reduce the delay of repairing failed nodes, our scheme adopts the idea of the FR code. FR code is a non-coding repair based on a table, which reduces the complexity of additional coding repair than the regenerating code [27].
However, once the parameters of FR code are set, the number of nodes are fixed. This cannot adapt to the constant changes in the number of nodes in the blockchain network. To meet the change in the number of nodes, we propose the transcript fractional repetition code which is newly constructed based on the FR code.
1) Review Fractional Repetition Codes
The FR code is constructed based on the Maximum Distance Separable (MDS) [28] code. The data is encoded into
Here we need to describe the MDS code: The MDS code with the parameter
FR code can be described as follows: The MDS encoded fragments \begin{equation*} \theta \rho = nd\tag{1}\end{equation*}
An example of a simple FR code. First, a data block is divided into six equal pieces, and then encoded into eight equal pieces by a MDS code (8,6). Finally, these encoded fragments are repeatedly placed to form an FR code with a repetition degree
2) Transcript Fractional Repetition Codes
To meet the fluctuation of the number of nodes, we constructed the TFR code. The TFR code extends the flexibility of the FR code by adding replication levels to cope with the fluctuation of the number of nodes within the group.
Construct the TFR code: First construct an initial FR code as level 0 and then add levels based on the level 0. All levels form a level index table. The level index table is dynamic and changes as nodes within the group change. Each complete level is a copy of the initial FR code. In Figure 5, an example of the TFR code, the initial FR code with n = 6 is shown{
A example of TFR code. TFR code is formed by adding levels of replication to the basis of FR code.
The encoding process: The duty node encodes the block data using the MDS code, and performs encoding fragments distribution according to the level index table of the TFR code. For example, send a set of encoded fragments
Verification encode fragments: After receiving the encoded fragments, the nodes perform to decode verification. After determining the trueness of the data encode, the nodes delete the original block data to save the encoded fragments.
Repair rules: Priority to repair the node of the upper level. We specify that the level 0 has the highest priority, so the integrity of level 0 needs to be maintained in priority. Repair the fault node of a higher level, first find a copy of the fault node from the lowest level and move it to the higher level of the level index table. If the lowest level does not have a copy of this failed node, it executed up one level. In the worst case, the fault node has no copy and needs to wait for the new node to join to repair it.
Restore data: A node should preferentially obtain the encoded fragments from the nodes of the same level for decoding. If the encode fragments of the same level are not enough, then to access the upper level to get the encode fragments.
Dynamic level index tables:
Remove node: When a node is removed by the IP list of the group members, the node will be removed from the level index table. After the node is removed, repair it according to the repair rules.
Add node: When a new node joins a group, the level index table needs to be updated. First, the new node gets the level index table of the group from the duty node. Then, according to the level index table and repair rules, the new node repairs the fault node. If all levels are complete, the new node will form a new level. After the new node updates the level index table, it will broadcast the content of update within the group. When the update is received, the members of the group will update the level index table.
Security Analysis
We analyze GCBlock security in terms of grouping security and coding availability.
A. Grouping Security
GCBlock serves the public blockchain, so the data is public and there are no access control security issues. GCBlock is essentially a mutually beneficial storage solution. GCBlock is essentially a mutually beneficial storage solution. Due to the GCBlock is built on the overlay, we do not consider the case where the underlying layer is being attacked on a large scale. Implementing a large-scale attack against GCBlock does not bring any direct benefits, and the attack cost are comparable to attack the original blockchain (for example, 51% attack).
We will discuss the evil behavior presented in the threat model. The nodes that join the group are only interested in reducing storage. According to the game, we represent the reduced storage overhead as profit. So, the stable profit of per node within the group are
(remove, evil): When
removes the evil node from the IP list of the group members,P_{b} makes a choice for the stability of the group and will continue to maintain the profitP_{b} . For(k-d)L/k , will lose the profitP_{a} when it is removed.(k-d)L/k (remove, not evil): When
removes a not evil node from the IP list of the group members, it reduces the chance that gets the encoded fragments. This biggest loss may beP_{b} . This is an act that does harm to others but no good to itself.(k-d)L/k (not remove, evil): If
does not remove the evil node from the IP list of the group members, this will affect the stability of the group and cause a loss to itself. The biggest loss isP_{b} . And then the profit of the evil node is(k-d)L/k .L (not remove, not evil): When there are no evil nodes,
does not remove any nodes from the IP list of the group members. In this state, all nodes maintain the profitP_{b} .(k-d)L/k
The wisest choice of nodes is not to do evil, so as to ensure the balance of interests of all parties. And the honest node does not remove the non-evil node from the IP list of the group members, so that both parties do not lose. When a node is evil, other nodes will remove it from the IP list of the group members and add it to the blacklist to ensure maximum benefit. Therefore, this will reach a state of Nash equilibrium.
B. Coding Availability
In [11], we can know that the FR code can maintain the uncoded repair feature when
Performance Evaluation
We evaluate the performance of the GCBlock based on the effect of grouping and the performance of erasure code. All of our experiments were performed on a Lenovo ThinkStation P910 with two Intel Xeon E5-2620 v4@2.1GHz and 64G DDR4 RAM installed.
A. Grouping Implementation
In order to get closer to the grouping effect of the original network, we join the Bitcoin network to collect the IP addresses by DNS seed [29]. The brief process: To get the IP address from the DNS seed and put it into the seed collection of IP. To create an epoll, constantly extracting IP from the seed collection to establish a TCP connection, and sniff the read and write events. As long as the addr message(a message containing neighbors’ IP addresses) is received, the IP addresses are extracted into the IP seed set, and thus loops to collect the IP address in the network. The IP addresses between neighbor nodes are set one distance delay. We collected a total of 27403 IP addresses (including full nodes and light nodes). We extracted 1000, 1500, 2000, 2500, 3000, 3500, and 4000 IP addresses respectively, and constructed the adjacency matrix and distance matrix of these IP addresses to implement the beginning grouping algorithm. We set the initial selection
The average distance within the group and the average distance between adjacent groups according to the IP hops with the different number of bitcoin network nodes.
B. The Performance of the Erasure Code
1) Encoding and Decoding
In order to test the performance of the TFR code, we use the Reed-Solomon code which is a good quality MDS code. Encoding is essentially a process of calculating the parity-check fragments. As the number of parity-check fragments increases, the complexity of encoding will increase. Decoding complexity depends on the finite field of the encoded fragments. The larger the finite field, the greater the complexity of decoding. We study the Backblaze [30] and implement Reed-Solomon encoding and decoding with AVX 2.0 instructions and Golang on two Intel Xeon E5-2620 v4@2.1GHz. We tested the encoding and decoding performance of a 1MB block size. The block size we chose to test is based on the standards of Bitcoin [1] and Ethereum [2]. Figure 8 and 9 show the speed of encoding and decoding on different
In order to test the efficiency of Merkel hash tree generation, we implemented the Merkel hash tree generation process using Golang. We analyzed the generation efficiency of Merkel hash trees with different data sizes and different numbers of encoded fragments. In Figure 10, we can see that the generation of merkel hash tree is extremely efficient, which means that the resource consumption is extremely small. We can also see that when the number of encoded fragments is constant, the larger the data, the slower the Merkel hash tree is generated. When the data size is constant, the number of encoded fragments increases, and the slower the Merkel hash tree is generated. If a system needs to consider reducing the resources consumption of generating Merkle hash tree, we can set a smaller
The Merkle hash tree generation speed with different data sizes and different number of encoded fragments.
2) Evaluate the Bandwidth Consumption
The encoded fragments in the TFR are multicast transmitted according to the levels index table, so that the duty node transmits encoded fragments only n times for nodes in different levels. When encoding a Y size data, the bandwidth consumption by the duty node transmits the encoded fragments is
3) Storage Overhead Reduction Rate (Sorr)
The storage overhead reduction rate of each node is
Conclusion
This paper presents the design of the GCBlock – an open scheme for reducing the nodes storage overhead of blockchain. GCBlock is designed to provide an option to reduce storage costs for existing public blockchain systems. Each node that joins the GCBlock can effectively reduce storage costs by implementing an erasure code scheme within the group. In order to reduce the delay of nodes acquiring data, we try to use the idea of fuzzy distance clustering to group. Within the group, we use a reconstructed TFR code that satisfies the dynamic scalability of the number of nodes to reduce the storage overhead of the nodes. In order to prevent evil behavior, we develop an autonomous check strategy and prove that it satisfies the Nash equilibrium. We have conducted extensive performance evaluation and showed that the scheme is feasible and reasonable.
Since the GCBlock does not modify the original architecture of blockchain, it can be easily used in existing public blockchain systems as a scheme to reduce data redundancy. In addition, the number of data transmission of nodes to serve peers are different in the scheme. It is an interesting topic for nodes to actively serve peers. In the future work, we will carry on the related incentive mechanism research.