Author: Gerry Wang @ Arweave Oasis, originally published on @ArweaveOasis Twitter
This article continues the interpretation of Section 4, "Protocol Mechanism", of the 17th edition of the Arweave white paper.
How to calculate the total number of partitions from network parameters?
Using the expressions described above and some information provided by the Arweave network, we can calculate the total number of replicas the network is storing. Every time a miner mines a new block, we can determine whether the block's solution hash came from the SPoA challenge of the first recall range or of the second recall range. In a network that stores full replicas, the ratio of second-range to first-range solutions is expected to be roughly 1:1. However, if miners store incomplete data partitions or duplicate partitions (and are therefore penalized in efficiency), this ratio will be less than 1.
We can calculate the average hash rate of each partition from the ratio of observed SPoAs. Assume that in the past 1,000 blocks there were n1 first-range SPoAs and n2 second-range SPoAs. This implies an average replica completeness of n2/n1, so the mining efficiency of each partition is:
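One form consistent with the note that follows, assuming each partition draws half of its hashes from the first range and half from the second (an assumption about how the 800 hashes per second are split, not something this article states), is:

```latex
e_m = \frac{1}{2}\left(1 + \frac{n_2}{n_1}\right)
```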
Formula Notes: In this formula, if the ratio of n1 to n2 is 1:1, then e_m is 1.
Using the above expression, we can accurately estimate the total number of partitions in the network. When the difficulty parameter is d, the expected value of the number of hash attempts is given by:
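A common way to write this is as the reciprocal of the per-attempt success probability p. The success condition used here, that an attempt succeeds when the 256-bit hash read as an integer exceeds d, is an assumption and is not stated in this article:

```latex
p = \frac{2^{256} - d}{2^{256}}, \qquad E[\text{trials}] = \frac{1}{p} = \frac{2^{256}}{2^{256} - d}
```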
When the efficiency of each partition is only e_m, the expected number of partitions required to generate this many attempts in 120 seconds is:
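Written out from the quantities named in the notes that follow (the left-hand label is a placeholder, not necessarily the white paper's symbol):

```latex
\text{partitions} = \frac{E[\text{trials}]}{800 \cdot e_m \cdot 120}
```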
Formula Notes: E[trials] is the expected total number of hash attempts on the network; 800 is the maximum number of hashes per second for a partition; multiplying by e_m gives the hash rate at that mining efficiency; and multiplying by 120 gives the number of hashes at that efficiency in one mining cycle (usually about 2 minutes).
Since each partition is 3.6 TB in size, the deployed storage capacity of the network is 3.6 TB multiplied by the estimated number of partitions.
All of these metrics about the stored data set and the average replica completeness can be calculated from the values observed in the network.
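The whole chain of estimates can be tried numerically. The following is a minimal sketch, not the white paper's implementation: the constants 800, 120 and 3.6 TB come from this article, while the efficiency formula and the difficulty-to-probability mapping are assumptions, marked as such in the code. The function names and the sample difficulty are made up for illustration.

```python
HASHES_PER_SECOND_PER_PARTITION = 800  # maximum hashes per second for one partition
BLOCK_TIME_SECONDS = 120               # one mining cycle, about 2 minutes
PARTITION_SIZE_TB = 3.6                # size of one partition


def mining_efficiency(n1: int, n2: int) -> float:
    """Per-partition efficiency e_m from observed SPoA counts.

    ASSUMPTION: e_m = (1 + n2/n1) / 2, i.e. half of a partition's hashes
    come from each range. This gives 1 when n1:n2 is 1:1.
    """
    if n1 <= 0:
        raise ValueError("n1 must be positive")
    return (1 + n2 / n1) / 2


def expected_trials(d: int) -> float:
    """Expected number of hash attempts per block at difficulty d.

    ASSUMPTION: an attempt succeeds when a 256-bit hash exceeds d, so the
    success probability is (2**256 - d) / 2**256.
    """
    space = 2 ** 256
    if not 0 <= d < space:
        raise ValueError("d must be in [0, 2**256)")
    return space / (space - d)


def estimate_partitions(d: int, n1: int, n2: int) -> float:
    """Partitions needed to produce E[trials] attempts in one mining cycle."""
    hashes_per_cycle = (
        HASHES_PER_SECOND_PER_PARTITION * mining_efficiency(n1, n2) * BLOCK_TIME_SECONDS
    )
    return expected_trials(d) / hashes_per_cycle


def deployed_storage_tb(partitions: float) -> float:
    """Deployed storage in TB for a given number of partitions."""
    return PARTITION_SIZE_TB * partitions


if __name__ == "__main__":
    sample_d = 2 ** 256 - 2 ** 236  # made-up difficulty, not a network value
    partitions = estimate_partitions(sample_d, n1=600, n2=400)
    print(f"e_m = {mining_efficiency(600, 400):.3f}")
    print(f"partitions = {partitions:.2f}")
    print(f"storage = {deployed_storage_tb(partitions):.1f} TB")
```

Feeding in the difficulty and the SPoA counts observed over a recent window of blocks would give the corresponding estimate for the live network, subject to the assumptions above.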
Incentives for Optimizing Data Routing
Incentivizing miners to build complete replicas in order to improve their mining efficiency sets off a series of further incentives that benefit the protocol. One of them is that, to move data quickly across the peer-to-peer network, miners are pushed to develop optimized data routing solutions, which gives a strong driving force to a complex and critical problem. Because nodes must be able to transfer any piece of data in the network quickly, they need to maintain reusable routing capabilities, which in turn makes data easier for users and other miners to access and improves data availability.
For miners, this new incentive to optimize data routing could create a competitive environment, similar to how Bitcoin miners competed to develop more efficient specialized mining hardware. This competition would promote innovation in routing infrastructure, ultimately leading to a more efficient and powerful distributed network.
Bandwidth Sharing Incentives
Another side effect of Arweave's mining incentives for storing replicas is that miners have a strong need to obtain data from the network. This gives rise to several market models for data access, including:
Karma and Optimistic Reciprocity: Nodes in the Arweave network jointly take part in a bandwidth sharing game similar to BitTorrent. In this game, nodes share data with one another and, in addition, occasionally share data with random peers in the optimistic expectation of future reciprocation. Each node maintains its own peer rankings without reporting how or why these rankings were determined. Such mechanisms have been very successful in data sharing platforms such as BitTorrent, which once accounted for about 27% of the world's Internet traffic.
Benefits from physical disk distribution: Node operators can directly buy or sell physical disks storing weave data in exchange for money or other forms of payment. For miners with limited bandwidth, this may be the more attractive option given the large amount of data required to run an Arweave node. This method of transfer also bypasses traditional packet filters and firewalls. In practice, downloading the raw data is a real hurdle that many new miners have to overcome, and as the amount of data on the network grows, this channel for acquiring data will become increasingly convenient and efficient.
Payment Protocol: Nodes can also take part in protocols and markets that let them pay for access to data. The Permaweb Payment Protocol (P3) provides one such way, using payment channels to incentivize a variety of services within Arweave (including simple data access).
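The reciprocity game in the first item above can be sketched as follows. This is a hypothetical model, not Arweave node code: the class name, the scoring rule (bytes received from a peer) and the single optimistic slot are all assumptions chosen to mirror BitTorrent-style behavior.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional


class PeerRanker:
    """Private, local peer ranking (hypothetical scoring rule)."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        # Score is simply the number of bytes each peer has served to us.
        self.bytes_received: Dict[str, int] = defaultdict(int)
        self.rng = rng or random.Random()

    def record_received(self, peer: str, n_bytes: int) -> None:
        """Credit a peer for data it served to us."""
        self.bytes_received[peer] += n_bytes

    def choose_peers(self, candidates: List[str], slots: int) -> List[str]:
        """Serve the best-ranked peers, keeping one slot for a random peer."""
        if slots <= 0:
            return []
        ranked = sorted(candidates, key=lambda p: self.bytes_received[p], reverse=True)
        chosen = ranked[: slots - 1]
        rest = ranked[slots - 1:]
        if rest:
            # Optimistic share: a peer with no track record can still be served.
            chosen.append(self.rng.choice(rest))
        return chosen


if __name__ == "__main__":
    ranker = PeerRanker(random.Random(0))
    ranker.record_received("peer-a", 100)
    ranker.record_received("peer-b", 50)
    print(ranker.choose_peers(["peer-a", "peer-b", "peer-c", "peer-d"], slots=3))
```

The ranking never leaves the node, which matches the point that peers do not report how or why their rankings were determined.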
Scalability
Arweave produces a block roughly every 2 minutes on average, and each block contains up to 1,000 transactions. This limit keeps block verification and synchronization extremely lightweight, allowing the network to be widely decentralized. However, the transaction limit does not cap the size or amount of data stored in a given block, because Arweave uses a mechanism called "bundling". Bundling is a network-wide standard (ANS-104), built on top of the core protocol, for combining many different data items into a single transaction. These data items are functionally equivalent to top-level data storage transactions on the network, because bundled transactions can be "unbundled" into their constituent items when retrieved.
Arweave has a maximum transaction size of 2^{256}-1 bytes, and a transaction can be divided into any number of separate data items in potentially recursive bundles. This allows the network's throughput to scale without practical limits. This optimization is possible because data uploads on Arweave are not parameterized: every byte on the network is part of the same global Merkleized data set and is backed by a shared storage endowment. One element of this design is the aggregation of payments from individual data items into upload bundles. Users can either combine the payments for their data items in a bundling transaction, or move payment entirely off-chain, where a bundling service provider combines their data items with those of other users.
Figure 1: Bundling allows data to be passed up and stacked into a top-level transaction.
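Recursive bundling can be pictured as a tree whose leaves are data items. The sketch below is a simplified in-memory model for illustration only: the `DataItem` and `Bundle` types and the `unbundle` function are hypothetical and do not follow the ANS-104 binary format.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class DataItem:
    """A single piece of user data (hypothetical, simplified)."""
    owner: str
    data: bytes


@dataclass
class Bundle:
    """A container of data items and/or nested bundles (hypothetical)."""
    items: List[Union["Bundle", DataItem]] = field(default_factory=list)


def unbundle(node: Union[Bundle, DataItem]) -> Iterator[DataItem]:
    """Yield every data item in a bundle, descending into nested bundles."""
    if isinstance(node, DataItem):
        yield node
    else:
        for child in node.items:
            yield from unbundle(child)


if __name__ == "__main__":
    inner = Bundle([DataItem("alice", b"photo"), DataItem("bob", b"note")])
    top_level = Bundle([inner, DataItem("carol", b"log")])
    # One top-level transaction slot carries all three data items.
    for item in unbundle(top_level):
        print(item.owner, len(item.data))
```

However deep the nesting goes, the whole tree still occupies a single one of the 1,000 transaction slots in a block.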
In Arweave, transactions are selected for the 1,000 slots of each block based on their total value, because the inclusion fee earned by miners is proportional to the transaction fee. This incentivizes bundling services to combine transactions recursively, which increases the scalability of the network when block space is scarce. As a result, any number of bundlers and users can write data to the network at any given time without triggering a block-space auction of the kind seen on other blockchains. In addition, competition between bundlers to build larger transactions puts downward pressure on the final fees paid by users. This is in stark contrast to other blockchains, where competition for limited block space is fierce, which drives fees up and eventually forces some users to stop using the network.
Figure 2: The preference for larger bundles incentivizes bundlers to recursively bundle data to minimize fee costs.
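The selection rule can be illustrated with a small sketch. It assumes a miner simply keeps the pending transactions with the highest total fee; the function name and the tuple layout are hypothetical, and real node policy may weigh other factors.

```python
import heapq
from typing import List, Tuple

MAX_TXS_PER_BLOCK = 1000  # per-block transaction limit from the article

# A pending transaction is modeled as (transaction id, total fee).
PendingTx = Tuple[str, int]


def select_for_block(pending: List[PendingTx], slots: int = MAX_TXS_PER_BLOCK) -> List[PendingTx]:
    """Pick the `slots` pending transactions with the highest total fee."""
    return heapq.nlargest(slots, pending, key=lambda tx: tx[1])


if __name__ == "__main__":
    # Fifty data items worth 1 fee unit each, combined into one transaction,
    # outrank any individual small transaction when slots are scarce.
    pending = [("small-1", 1), ("bundle-of-50", 50), ("small-2", 2), ("small-3", 3)]
    print(select_for_block(pending, slots=2))
```

Under this model, combining many small data items into one transaction raises that transaction's total fee and so its chance of taking a slot, while the cost of the slot is shared across all the items inside it.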
Users can also upload data through off-chain bundling service providers. The advantage is that users can pay for Arweave storage with any payment method the bundling service supports, while the bundler settles the grouped data in AR. As of now, the Arweave network supports at least 18 different payment methods through bundling services.