Author: Chakra; Compiler: 0xjs@Golden Finance
This article is Part III of Chakra's Bitcoin scalability series.
For the first part, please refer to Jinse Finance's previous article "A Review of Bitcoin's Native Scaling Solutions: SegWit and Taproot".
For the second part, please refer to Jinse Finance's previous article "Bitcoin Scalability: Layer2 Solutions and Related Project Analysis".
The third part is this article, as follows:
Overview
Compared to Turing-complete blockchains such as Ethereum, Bitcoin scripts are considered to be extremely restrictive, capable of only basic operations, and do not even support multiplication and division. More importantly, the blockchain's own data is almost inaccessible to scripts, resulting in a serious lack of flexibility and programmability. Therefore, there has been an effort to enable introspection in Bitcoin scripts.
Introspection refers to the ability of Bitcoin scripts to inspect and constrain transaction data. This allows scripts to control the use of funds based on specific transaction details, thereby implementing more complex functions. Currently, most Bitcoin opcodes either push user-provided data onto the stack or manipulate existing data on the stack. Introspection opcodes, by contrast, can push data from the current transaction (such as the locktime, amounts, or txid) onto the stack, allowing for more granular control of UTXO spending.
As of now, only three main opcodes in Bitcoin Script support introspection: CHECKLOCKTIMEVERIFY, CHECKSEQUENCEVERIFY, and CHECKSIG, together with the CHECKSIG variants CHECKSIGVERIFY, CHECKSIGADD, CHECKMULTISIG, and CHECKMULTISIGVERIFY.
A covenant, in simple terms, is a restriction on how coins can be transferred, allowing users to specify how UTXOs may be spent. Many covenants are implemented through introspection opcodes, and discussions about introspection are now categorized under the Bitcoin Optech Covenants topic.
Bitcoin currently has two covenants, CSV (CheckSequenceVerify) and CLTV (CheckLockTimeVerify), both of which are time-based and are the basis for many scaling solutions (such as the Lightning Network). This shows that Bitcoin's scaling solutions rely heavily on introspection and covenants.
How do we add conditions to the transfer of coins? In the field of cryptocurrency, the most common method is through commitments, usually implemented with hashes. To prove that the transfer requirements are met, a signature mechanism is also required for verification. Covenant proposals therefore involve many adjustments to how hashes and signatures are handled.
Below, we describe the most widely discussed covenant opcode proposals.
CTV (CheckTemplateVerify) BIP-119
CTV (CheckTemplateVerify) is a Bitcoin upgrade proposal specified in BIP-119 that has attracted widespread attention from the community. CTV allows output scripts to specify a template for the transaction that spends the funds, covering fields such as nVersion, nLockTime, the scriptSig hash, the input count, the sequences hash, the output count, the outputs hash, and the input index. These template restrictions are implemented through a hash commitment: when the funds are spent, the script checks whether the hash of the specified fields in the spending transaction matches the hash committed in the script of the output being spent. This effectively limits the time, method, and amount of future transactions for that UTXO.
It is worth noting that the TXID of the input is excluded from this hash. This exclusion is necessary because in legacy and SegWit transactions signed with the default SIGHASH_ALL type, the TXID depends on the value in the scriptPubKey. Including the TXID would create a circular dependency in the hash commitment, making it impossible to construct.
CTV's introspection method is to directly pull the specified transaction information for hashing and then compare it with the commitment on the stack. This introspection method has low requirements for chain space, but lacks certain flexibility.
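To make the template commitment concrete, here is a minimal Python sketch of the hash described above, following the field order listed in BIP-119. It is an illustration under simplifying assumptions, not a reference implementation: the `Tx` and `TxOut` classes are hypothetical helpers, all scriptSigs are assumed to be empty (so the optional scriptSig hash is omitted), and only short scripts are supported.

```python
import hashlib
import struct
from dataclasses import dataclass


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def compact_size(n: int) -> bytes:
    # Bitcoin's variable-length integer; only the short forms are handled here.
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    raise ValueError("script too long for this sketch")


@dataclass
class TxOut:  # hypothetical helper type
    amount_sat: int
    script_pubkey: bytes

    def serialize(self) -> bytes:
        return (struct.pack("<q", self.amount_sat)
                + compact_size(len(self.script_pubkey))
                + self.script_pubkey)


@dataclass
class Tx:  # hypothetical helper type; note that no input TXIDs are stored
    version: int
    locktime: int
    sequences: list  # one nSequence value per input
    outputs: list    # list of TxOut


def ctv_template_hash(tx: Tx, input_index: int) -> bytes:
    r = struct.pack("<i", tx.version)
    r += struct.pack("<I", tx.locktime)
    # Assumption: every scriptSig is empty, so the scriptSig hash is skipped.
    r += struct.pack("<I", len(tx.sequences))
    r += sha256(b"".join(struct.pack("<I", s) for s in tx.sequences))
    r += struct.pack("<I", len(tx.outputs))
    r += sha256(b"".join(o.serialize() for o in tx.outputs))
    r += struct.pack("<I", input_index)
    return sha256(r)
```

Because the input TXIDs never enter the hash, the same template can be committed to before the funding transaction exists, while any change to an output amount or script produces a different hash.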
The basis of Bitcoin's second-layer solutions (such as the Lightning Network) is pre-signed transactions. Pre-signing generally refers to generating and signing transactions in advance, but not broadcasting them until certain conditions are met. In essence, CTV implements a stricter form of pre-signing, publishing pre-signed commitments on the chain and restricting them to predefined templates.
CTV was originally proposed to alleviate Bitcoin's congestion, a use case also known as congestion control. In times of severe congestion, CTV can commit to multiple future transactions in a single transaction, avoiding the need to broadcast many transactions during peak hours, and the actual payouts can be completed after the congestion eases. This may be particularly useful during runs on exchanges. In addition, templates can be used to implement vaults that protect against theft. Since the flow of funds is predetermined, an attacker cannot redirect UTXOs locked by CTV scripts to their own addresses.
CTV can significantly enhance second-layer networks. For example, in the Lightning Network, CTV can create timeout trees and channel factories by expanding a single UTXO into a CTV tree, opening multiple state channels with only one transaction and one confirmation. In addition, CTV also supports atomic transactions in the Ark protocol through ATLCs.
APO (SIGHASH_ANYPREVOUT) BIP-118
BIP-118 introduces a new signature hash flag for tapscript, called SIGHASH_ANYPREVOUT, designed to enable more flexible spending logic. APO and CTV have many similarities. To resolve the circular dependency between scriptPubKeys and TXIDs, APO excludes the relevant input information and signs only the outputs, allowing transactions to be dynamically bound to different UTXOs.
Logically, the signature verification operation OP_CHECKSIG (and its variants) performs three functions:
1. Assemble the parts of the spending transaction.
2. Hash them.
3. Verify that the hash has been signed by a given key.
The specific details of the signature are very flexible, and the SIGHASH flag determines which fields of the transaction are signed. According to the definition of the signature opcode in BIP 342, the SIGHASH flags are divided into SIGHASH_ALL, SIGHASH_NONE, SIGHASH_SINGLE, and SIGHASH_ANYONECANPAY. SIGHASH_ANYONECANPAY controls the input, while the others control the output.
SIGHASH_ALL is the default SIGHASH flag, signing all outputs; SIGHASH_NONE does not sign any output; SIGHASH_SINGLE signs a specific output. SIGHASH_ANYONECANPAY can be set together with the first three SIGHASH flags. If SIGHASH_ANYONECANPAY is set, only the specified input is signed; otherwise, all inputs must be signed.
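The flag semantics above can be summarized in a few lines of Python. This is a sketch for illustration only: the flag values are the standard ones, the set of accepted bytes is assumed to be the taproot set (where 0x00 acts as the default and behaves like SIGHASH_ALL), and the function name and returned labels are made up for this example.

```python
SIGHASH_ALL = 0x01
SIGHASH_NONE = 0x02
SIGHASH_SINGLE = 0x03
SIGHASH_ANYONECANPAY = 0x80

# Assumption: hash type bytes accepted for taproot signatures.
VALID_TAPROOT_HASH_TYPES = {0x00, 0x01, 0x02, 0x03, 0x81, 0x82, 0x83}


def describe_sighash(flag: int) -> dict:
    """Return which inputs and outputs a signature with this flag commits to."""
    if flag not in VALID_TAPROOT_HASH_TYPES:
        raise ValueError(f"invalid hash type: {flag:#04x}")
    base = flag & 0x03  # the low bits select the output mode
    outputs = {
        0x00: "all",  # default, same coverage as SIGHASH_ALL
        SIGHASH_ALL: "all",
        SIGHASH_NONE: "none",
        SIGHASH_SINGLE: "the output at the same index",
    }[base]
    # The high bit selects the input mode.
    inputs = "this input only" if flag & SIGHASH_ANYONECANPAY else "all"
    return {"inputs": inputs, "outputs": outputs}
```

For example, `describe_sighash(0x83)` reports a signature that covers only its own input and the output at the same index, which is the combination of SIGHASH_SINGLE and SIGHASH_ANYONECANPAY.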
Clearly, none of these SIGHASH flags can eliminate the influence of the inputs: even SIGHASH_ANYONECANPAY still requires a commitment to one input.
Therefore, BIP 118 proposed SIGHASH_ANYPREVOUT. APO signatures do not need to commit to the spent input UTXOs (called PREVOUTs) and only sign the outputs, providing greater flexibility in controlling bitcoin. By pre-building a transaction and creating a corresponding one-time signature and public key, assets sent to that public key's address can only be spent through the pre-built transaction, thus implementing a covenant. APO's flexibility can also be used for transaction repair: if a transaction is stuck unconfirmed because its fee is too low, another transaction can easily be created to increase the fee without the need for a new signature. In addition, for multi-signature wallets, not depending on the spent inputs makes operations more convenient.
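The rebinding property can be illustrated with a toy digest function. This is only a sketch of the idea: the field layout below is invented for the example and is not the BIP-118 message format, which covers more fields.

```python
import hashlib


def signed_digest(prevout: bytes, outputs: list, anyprevout: bool) -> bytes:
    """Toy model of the message a signature commits to.

    prevout: bytes identifying the UTXO being spent (txid plus index).
    outputs: list of (amount_sat, script_pubkey) tuples.
    """
    h = hashlib.sha256()
    if not anyprevout:
        # An ordinary signature is bound to the specific UTXO it spends.
        h.update(prevout)
    for amount_sat, script in outputs:
        h.update(amount_sat.to_bytes(8, "little"))
        h.update(len(script).to_bytes(2, "little") + script)
    return h.digest()
```

With `anyprevout=True`, two different UTXOs yield the same digest, so one signature over that digest could authorize spending either of them. With `anyprevout=False`, the digests differ and the signature is tied to a single UTXO.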
Because the cycle between scriptPubKeys and input TXIDs is eliminated, APO can perform introspection by adding output data in Witness, although this still requires additional Witness space consumption.
For off-chain protocols such as the Lightning Network and vaults, APO reduces the need to save intermediate states, greatly reducing storage requirements and complexity. The most immediate use case for APO is Eltoo (also known as LN-Symmetry), which simplifies channel factories, enables lightweight and cheap watchtowers, and allows unilateral exits without leaving erroneous states behind, thereby enhancing the Lightning Network in many ways. APO can also be used to emulate CTV functionality, although this requires users to store signatures and pre-signed transactions, which is more expensive and less efficient than CTV.
The main criticism of APO is that it requires a new public key version and cannot be introduced as a simple backward-compatible change to existing keys. In addition, the new signature hash type may bring a potential risk of double spending. After extensive community discussion, APO added a regular signature on top of the original signature mechanism to alleviate these security concerns, resulting in the current BIP-118 proposal.
OP_VAULT BIP-345
BIP-345 proposes to add two new opcodes, OP_VAULT and OP_VAULT_RECOVER, which, when combined with CTV, can implement specialized covenants that allow users to enforce a delay on spending specific coins. During this delay, a previously initiated withdrawal can be “undone” via the recovery path.
A user can create a vault by creating a specific Taproot address, which must contain at least two scripts in its MAST: one with the OP_VAULT opcode to facilitate the intended withdrawal process, and another with the OP_VAULT_RECOVER opcode to ensure that the coins can be recovered at any time before the withdrawal is completed.
How does OP_VAULT achieve interruptible time-locked withdrawals? OP_VAULT does this by replacing the used OP_VAULT script with the specified script, effectively updating a single leaf of the MAST while leaving the rest of the Taproot leaf nodes unchanged. This design is similar to TLUV, except that OP_VAULT does not support updates to internal keys.
By introducing templates during the script update process, payments can be restricted. The timelock parameter is specified by OP_VAULT, and the template of the CTV opcode restricts the set of outputs that can be spent through this script path.
Designed specifically for vaults, BIP-345 uses OP_VAULT and OP_VAULT_RECOVER to give users a secure custody method: highly secure keys (such as paper wallets or distributed multi-signature setups) serve as the recovery path, while routine payments are subject to a configured delay. The user's device continuously monitors spending from the vault, and if an unexpected transfer occurs, the user can initiate a recovery.
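The withdrawal flow can be modeled as a small state machine. The Python sketch below is a hypothetical model of the behavior described above, not the BIP-345 opcode semantics: the class, method names, and string states are invented for illustration, and the destination check stands in for the CTV template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vault:
    balance_sat: int
    delay_blocks: int
    state: str = "vaulted"  # vaulted -> triggered -> withdrawn, or -> recovered
    pending_destination: Optional[str] = None
    trigger_height: Optional[int] = None

    def trigger(self, destination: str, height: int) -> None:
        """Start a withdrawal; the destination is fixed from this point on."""
        if self.state != "vaulted":
            raise ValueError("vault is not in the vaulted state")
        self.state = "triggered"
        self.pending_destination = destination
        self.trigger_height = height

    def recover(self) -> None:
        """Sweep funds to the recovery path; allowed until the withdrawal completes."""
        if self.state == "withdrawn":
            raise ValueError("withdrawal already completed")
        self.state = "recovered"
        self.pending_destination = None

    def withdraw(self, destination: str, height: int) -> None:
        """Complete the withdrawal once the delay has elapsed."""
        if self.state != "triggered":
            raise ValueError("no withdrawal has been triggered")
        if destination != self.pending_destination:
            raise ValueError("destination does not match the committed template")
        if height < self.trigger_height + self.delay_blocks:
            raise ValueError("timelock has not expired yet")
        self.state = "withdrawn"
```

A watchtower-style monitor would call `recover()` if it observes a `trigger` it did not expect, which is possible at any block height before `withdraw` succeeds.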
There are fee considerations when implementing vaults through BIP-345, especially for recovery transactions. Possible solutions include CPFP (Child Pays For Parent), ephemeral anchors, and the new SIGHASH_GROUP signature hash flag.
TLUV (TapleafUpdateVerify)
The TLUV scheme is built around Taproot and aims to efficiently solve the problem of exiting a shared UTXO. The guiding principle is that when a Taproot output is spent, the internal key and the MAST (the tapscript tree) can be partially updated through cryptographic transformations and the internal structure of the Taproot address, as described in the TLUV script. This makes it possible to implement covenant functionality.
The concept of the TLUV scheme is to create a new Taproot address based on the current spending input by introducing a new opcode TAPLEAF_UPDATE_VERIFY. This can be achieved by doing one or more of the following:
Updating the internal public key
Pruning the Merkle path
Removing the currently executing script
Adding a new step to the end of the Merkle path
Specifically, TLUV accepts three inputs:
One that specifies how to update the internal public key.
One that specifies a new step for the Merkle path.
One that specifies whether to remove the current script and/or how many steps of the Merkle path to prune.
The TLUV opcode calculates the updated scriptPubKey and verifies whether the output corresponding to the current input is spent on this scriptPubKey.
TLUV is inspired by the concept of CoinPool. Today, such shared pools can be created using only pre-signed transactions, but permissionless exits then require an exponentially growing number of signatures. TLUV allows permissionless exits without any pre-signed transactions. For example, a group of partners can use Taproot to build a shared UTXO that pools their funds together. They can use the Taproot key to transfer funds internally, or they can jointly sign to make payments externally. An individual can exit the shared pool at any time, removing their own payment path, while the others can still complete payments through the original path, and the individual's exit does not expose any additional information about the remaining members. Compared with non-pooled transactions, this model is more efficient and more private.
The TLUV opcode implements partial spending restrictions by updating the original Taproot tree, but it does not provide introspection of output amounts. Therefore, a new opcode, IN_OUT_AMOUNT, is also needed. This opcode pushes two items onto the stack: the amount of the UTXO spent by this input and the amount of the corresponding output. The script author using TLUV then needs to use mathematical operators to verify that the funds are appropriately retained in the updated scriptPubKey.
Introspection of output amounts adds complexity because amounts in satoshis require up to 51 bits to represent, while Script only allows 32-bit math. This requires either redefining opcode behavior to upgrade the arithmetic operators in Script or replacing IN_OUT_AMOUNT with SIGHASH_GROUP.
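The 51-bit figure can be checked directly. The short Python calculation below assumes the 21 million BTC supply cap as the upper bound on any amount and the 4-byte limit on Script's numeric operands.

```python
SATS_PER_BTC = 100_000_000
MAX_SUPPLY_SAT = 21_000_000 * SATS_PER_BTC  # upper bound on any amount

# Script arithmetic operands are limited to 4-byte signed integers.
SCRIPT_INT_MAX = 2**31 - 1

bits_needed = MAX_SUPPLY_SAT.bit_length()


def fits_in_script_int(amount_sat: int) -> bool:
    """True if the amount can be used directly with Script's math opcodes."""
    return -SCRIPT_INT_MAX <= amount_sat <= SCRIPT_INT_MAX
```

Here `bits_needed` evaluates to 51, and the largest amount that fits in a Script integer is 2,147,483,647 satoshis, or about 21.47 BTC. Any UTXO larger than that cannot be compared or added with the existing operators.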
TLUV has the potential to be a solution for decentralized layer 2 pools, but its reliability in adjusting the Taproot tree remains to be confirmed.
MATT
MATT (Merkleize All The Things) aims to achieve three goals: Merkleizing the state, Merkleizing the script, and Merkleizing the execution, thereby realizing general smart contracts.
Merkleizing the state: This involves building a Merkle tree in which each leaf node represents the hash of a piece of state, and the Merkle root reflects the overall state of the contract.
Merkleizing the script: This refers to using tapscript to form a MAST, where each leaf node represents a possible state transition path.
Merkleizing the execution: Execution is Merkleized through cryptographic commitments and a fraud challenge mechanism. For any computational function, participants can compute it off-chain and then publish a commitment f(x)=y. If another participant believes the result is wrong and that in fact f(x)=z, they can initiate a challenge. Arbitration is performed through binary search, similar to the principle of Optimistic Rollups.
Merkle-ized execution
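The binary-search arbitration can be sketched in Python as follows. This is a simplified illustration rather than the MATT protocol itself: the two parties' execution traces are compared directly as lists of states, whereas a real challenge would compare Merkle commitments to the traces, and the function names are made up for this example.

```python
def find_disputed_step(trace_a: list, trace_b: list) -> int:
    """Binary search for an index i where the traces agree at i-1 but differ at i.

    Both traces must start from the same state and end in different states.
    """
    if len(trace_a) != len(trace_b) or len(trace_a) < 2:
        raise ValueError("traces must have the same length (at least 2)")
    if trace_a[0] != trace_b[0] or trace_a[-1] == trace_b[-1]:
        raise ValueError("traces must agree at the start and differ at the end")
    lo, hi = 0, len(trace_a) - 1  # invariant: agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if trace_a[mid] == trace_b[mid]:
            lo = mid
        else:
            hi = mid
    return hi


def arbitrate(step, trace_a: list, trace_b: list) -> str:
    """Re-execute only the single disputed step and report who was right."""
    i = find_disputed_step(trace_a, trace_b)
    expected = step(trace_a[i - 1])  # both parties agree on state i-1
    if trace_a[i] == expected:
        return "a"
    if trace_b[i] == expected:
        return "b"
    return "neither"
```

The point of the design is that the arbiter only ever re-executes one step, after a number of rounds logarithmic in the trace length, no matter how long the original computation was.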
To implement MATT, the Bitcoin script language needs to have the following features:
Force outputs to have a specific script (and its amount)
Append a piece of data to the output
Read data from the current input (or another input)
The second point is crucial: dynamic data means that the state can be computed from the input data provided by the spender. This allows a state machine to be simulated, since the script can determine the next state and attach it as data. The MATT scheme implements this via the OP_CHECKCONTRACTVERIFY (OP_CCV) opcode, which merges the previously proposed OP_CHECKOUTPUTCONTRACTVERIFY and OP_CHECKINPUTCONTRACTVERIFY opcodes and uses an additional flags parameter to specify the target of the operation.
Controlling output amounts: The most straightforward approach is direct introspection; however, output amounts are 64-bit numbers, requiring 64-bit arithmetic, which introduces significant complexity in Bitcoin Script. OP_CCV instead uses a deferred check, like OP_VAULT: the amounts of all inputs that CCV assigns to the same output are summed, and the sum acts as a lower bound on that output's amount. The check is deferred because it takes place during transaction validation, not during script evaluation of the individual inputs.
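A rough Python sketch of the deferred check follows. It models only the summing rule described above under simplifying assumptions; the function name and data layout are hypothetical, and the real OP_CCV rules have more modes than this.

```python
from collections import defaultdict


def deferred_amount_check(input_claims: list, output_amounts: list) -> bool:
    """input_claims: (input_amount_sat, target_output_index) per CCV input.
    output_amounts: amount in satoshis of each transaction output.
    """
    required = defaultdict(int)
    # Phase 1: while each input's script is evaluated, its amount is only
    # recorded against the output it targets; nothing is compared yet.
    for amount_sat, out_idx in input_claims:
        required[out_idx] += amount_sat
    # Phase 2: once all inputs are evaluated, each targeted output must
    # exist and carry at least the sum of the amounts assigned to it.
    return all(
        idx < len(output_amounts) and output_amounts[idx] >= need
        for idx, need in required.items()
    )
```

Because the comparison happens once per transaction, no script ever needs to add two 64-bit amounts on the stack.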
Given the generality of fraud proofs, some variant of the MATT contract should be able to implement all types of smart contracts or layer-2 constructions, although the additional requirements (such as locked funds and challenge period delays) need to be accurately evaluated; further research is needed to assess for which applications these trade-offs are acceptable. One example is using cryptographic commitments and a fraud challenge mechanism to simulate an OP_ZK_VERIFY function and thereby implement trustless Rollups on Bitcoin.
In practice, this has already happened. Johan Torås Halseth implemented elftrace using the OP_CHECKCONTRACTVERIFY opcode from the MATT soft fork proposal. It enables any program that can be compiled to RISC-V to be verified on the Bitcoin blockchain, allowing a party to the contract to access funds through contract verification and thereby bridging to Bitcoin's native verification.
CSFS (OP_CHECKSIGFROMSTACK)
From the introduction to APO above, we know that OP_CHECKSIG (and its related operations) is responsible for assembling the transaction, computing the hash, and verifying the signature. However, the message verified by these operations is always derived from the serialized spending transaction, and no other message can be specified. In short, the role of OP_CHECKSIG (and its related operations) is to verify, through a signature mechanism, that the UTXO spent as a transaction input is authorized by the holder of the signing key, thereby protecting the security of Bitcoin.
CSFS, as the name implies, checks the Signature from the stack. The CSFS opcode receives three parameters from the stack: signature, message, and public key, and verifies the validity of the signature. This means that people can pass any message to the stack through witnesses and verify it through CSFS, thereby realizing some innovations in Bitcoin.
The flexibility of CSFS enables it to implement mechanisms such as paying for signatures, authorization delegation, oracle contracts, double-spend protection bonds, and, more importantly, transaction introspection. The principle of transaction introspection using CSFS is very simple: the transaction content used by OP_CHECKSIG is pushed onto the stack through the witness, and OP_CSFS and OP_CHECKSIG are both run with the same public key and signature. If both verifications succeed, then the arbitrary message passed to OP_CSFS must be the same as the serialized spending transaction (and other data) implicitly used by OP_CHECKSIG. We then have verified transaction data on the stack, which other opcodes can use to impose restrictions on the spending transaction.
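The following Python sketch walks through that argument with a toy signature scheme. Everything here is an assumption made for illustration: the scheme is textbook Schnorr over integers modulo a small prime rather than BIP-340 over secp256k1, it is not secure, and `op_checksig`, `op_csfs`, and `introspect` are stand-ins for script behavior rather than real interpreter code.

```python
import hashlib

P = 2**127 - 1  # toy prime modulus (NOT secp256k1, NOT secure)
G = 3
ORDER = P - 1   # exponents are reduced mod P-1 (Fermat's little theorem)


def _h(*parts: bytes) -> int:
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big") + part)
    return int.from_bytes(h.digest(), "big") % ORDER


def _b(n: int) -> bytes:
    return n.to_bytes(16, "big")


def keygen(seed: bytes):
    x = _h(b"key", seed)
    return x, pow(G, x, P)  # (private key, public key)


def sign(x: int, msg: bytes):
    k = _h(b"nonce", _b(x), msg)  # deterministic nonce, toy only
    r = pow(G, k, P)
    e = _h(_b(r), _b(pow(G, x, P)), msg)
    return r, (k + e * x) % ORDER


def verify(sig, msg: bytes, pub: int) -> bool:
    r, s = sig
    e = _h(_b(r), _b(pub), msg)
    return pow(G, s, P) == (r * pow(pub, e, P)) % P


def op_checksig(sig, pub: int, spending_tx: bytes) -> bool:
    # The message is implicit: the hash of the transaction being validated.
    return verify(sig, hashlib.sha256(spending_tx).digest(), pub)


def op_csfs(sig, msg: bytes, pub: int) -> bool:
    # The message is explicit: whatever the witness put on the stack.
    return verify(sig, msg, pub)


def introspect(sig, pub: int, claimed_tx: bytes, spending_tx: bytes) -> bool:
    claimed_digest = hashlib.sha256(claimed_tx).digest()  # like OP_SHA256
    return op_csfs(sig, claimed_digest, pub) and op_checksig(sig, pub, spending_tx)
```

If `introspect` returns true, the same signature was valid for both the claimed data and the real transaction, so the claimed data on the stack can be trusted to describe the transaction being spent.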
CSFS often appears together with OP_CAT, because OP_CAT can concatenate the individual fields of a transaction to complete the serialization, allowing a more precise selection of the transaction fields needed for introspection. Without OP_CAT, the script cannot recompute the hash from data that can be checked individually, so all it can really do is check that the hash corresponds to a specific value, which means that the coins can only be spent through a single specific transaction.
CSFS can emulate opcodes such as CLTV, CSV, CTV, and APO, making it a multi-purpose introspection opcode. It therefore also contributes to scaling solutions for Bitcoin Layer 2. The disadvantage is that it requires placing a full copy of the signed transaction on the stack, which may significantly increase the size of transactions that use CSFS for introspection. In contrast, single-purpose introspection opcodes like CLTV and CSV have little overhead, but each new special-purpose introspection opcode requires a consensus change.
TXHASH (OP_TXHASH)
OP_TXHASH is a simple introspection opcode that lets a script select specific transaction fields, hash them, and push the result onto the stack. Specifically, OP_TXHASH pops a txhash flag from the stack, calculates a (tagged) txhash based on the flag, and then pushes the resulting hash back onto the stack.
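As a rough illustration of field selection, the Python sketch below hashes only the transaction fields picked by a bitmask. The selector bits, the tag string, and the dictionary layout are all hypothetical choices for this example; the TxFieldSelector encoding in the draft specification is different and more detailed.

```python
import hashlib

# Hypothetical selector bits, invented for this sketch.
SEL_VERSION = 1
SEL_LOCKTIME = 2
SEL_INPUTS = 4
SEL_OUTPUTS = 8


def tagged_hash(tag: str, data: bytes) -> bytes:
    # BIP-340 style tagged hash: SHA256(SHA256(tag) || SHA256(tag) || data)
    t = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(t + t + data).digest()


def tx_hash(tx: dict, selector: int) -> bytes:
    """tx holds 'version' and 'locktime' as ints and 'inputs' and 'outputs'
    as already-serialized bytes."""
    parts = [bytes([selector])]  # commit to the selector itself
    if selector & SEL_VERSION:
        parts.append(tx["version"].to_bytes(4, "little"))
    if selector & SEL_LOCKTIME:
        parts.append(tx["locktime"].to_bytes(4, "little"))
    if selector & SEL_INPUTS:
        parts.append(hashlib.sha256(tx["inputs"]).digest())
    if selector & SEL_OUTPUTS:
        parts.append(hashlib.sha256(tx["outputs"]).digest())
    return tagged_hash("TxHashSketch", b"".join(parts))
```

A script that compares `tx_hash(tx, SEL_VERSION | SEL_OUTPUTS)` against a fixed value pins the outputs while leaving the inputs free, so a fee-paying input could be added later without breaking the commitment.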
Due to the similarities between TXHASH and CTV, there has been a lot of discussion in the community about the two.
TXHASH can be seen as a generalized upgrade of CTV: it provides more advanced transaction templates and allows users to explicitly specify which parts of the spending transaction are covered, solving many problems related to transaction fees. Unlike other covenant opcodes, TXHASH does not require a copy of the necessary data in the witness, further reducing storage requirements. Unlike CTV, TXHASH is not NOP-compatible and can only be implemented in tapscript. The combination of TXHASH and CSFS can be used as an alternative to CTV and APO.
From the perspective of building covenants, TXHASH is more conducive to creating "additive covenants", in which all parts of the transaction data you want to fix are pushed onto the stack and hashed together, and the resulting hash is verified against a fixed value. CTV is more suitable for creating "subtractive covenants", in which all parts of the transaction data you want to keep free are pushed onto the stack. Then, using a rolling SHA256 opcode, hashing starts from a fixed intermediate state that commits to a prefix of the transaction hash data, and the free part is hashed into this intermediate state.
The TxFieldSelector field defined in the TXHASH specification is expected to be extended to other opcodes, such as OP_TX.
The BIP related to TXHASH is currently in Draft status on GitHub and has not yet been assigned a number.
OP_CAT
OP_CAT is a storied opcode that was originally disabled by Satoshi Nakamoto for security reasons, but it has recently sparked heated discussions among Bitcoin core developers and even set off a meme culture on the Internet. Eventually, the OP_CAT proposal was assigned the number BIP-347, and it has been called the BIP most likely to be adopted in the near future.
In fact, the behavior of OP_CAT is very simple: it concatenates two elements from the stack. How does it implement covenant functionality?
The ability to concatenate two elements corresponds to a powerful cryptographic data structure: the Merkle tree. To build a Merkle tree, only concatenation and hashing are required, and hash functions are already provided in Bitcoin Script. Therefore, using OP_CAT, we can theoretically verify Merkle proofs in Bitcoin Script, which is one of the most common lightweight verification methods in blockchain technology.
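The claim that concatenation plus hashing is enough can be demonstrated with a short Python sketch of Merkle proof verification. It is a generic illustration using single SHA256 and duplicating the last node on odd levels; the function names are made up, and the tree layout is an assumption rather than any specific Bitcoin structure.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list) -> list:
    return [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list) -> bytes:
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list, index: int) -> list:
    """Return a list of (sibling_hash, sibling_is_left) pairs for one leaf."""
    proof = []
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2 == 1))
        level = _next_level(level)
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list, root: bytes) -> bool:
    node = sha256(leaf)
    for sibling, sibling_is_left in proof:
        # Each step is one concatenation (OP_CAT) followed by one OP_SHA256.
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root
```

The verifier never sees the whole tree: it only needs the leaf, a logarithmic number of sibling hashes, and the root, and every operation it performs is either a concatenation or a hash.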
As mentioned earlier, CSFS can implement the general Covenant scheme with the help of OP_CAT. In fact, even without CSFS, OP_CAT itself can also achieve transaction introspection using the structure of Schnorr signatures.
In a Schnorr signature, the message to be signed consists of the following fields:
These fields contain the main elements of the transaction. By placing them in scriptPubKey or Witness and using OP_CAT combined with OP_SHA256, we can construct a Schnorr signature and verify it with OP_CHECKSIG. If the verification passes, the stack will retain the verified transaction data, enabling transaction introspection. This allows us to extract and "check" various parts of a transaction, such as its input, output, destination address, or the amount of Bitcoin involved.
For specific cryptographic principles, you can refer to Andrew Poelstra's article "CAT and Schnorr Tricks".
In summary, OP_CAT’s versatility enables it to emulate almost any covenant opcode. Many covenant opcodes rely on OP_CAT’s functionality, which greatly strengthens its position among the candidates for inclusion. In theory, relying solely on OP_CAT and existing Bitcoin opcodes, we have the potential to build a trust-minimized BTC ZK Rollup. Starknet, Chakra, and other ecosystem partners are actively working toward this goal.
Conclusion
As we explored various strategies for scaling Bitcoin and enhancing its programmability, it became clear that the path forward involves a fusion of native improvements, off-chain computations, and complex scripting capabilities.
Without a flexible base layer, it is impossible to build a more flexible second layer.
Scaling through off-chain computation is the trend of the future, but Bitcoin’s programmability needs a breakthrough to better support this scalability and become a truly global currency.
However, the nature of computation on Bitcoin is fundamentally different from that on Ethereum. Bitcoin only supports "verification" as a form of computation and cannot perform general computation, while Ethereum is computational in nature and verification is a byproduct of computation. This difference shows in one detail: Ethereum charges a Gas Fee for transactions that fail during execution, while Bitcoin does not.
Covenants are a form of smart contract based on verification rather than computation. Except for a few Satoshi fundamentalists, it seems that everyone thinks covenants are a good way to improve Bitcoin. However, the community is still arguing fiercely about which method should be used to implement them.
APO, OP_VAULT, and TLUV are application-oriented: each implements a specific use case more cheaply and efficiently. Lightning Network enthusiasts favor APO for implementing LN-Symmetry; users who want vaults are best served by OP_VAULT; and for building CoinPool, TLUV offers better privacy and efficiency. OP_CAT and TXHASH are more general-purpose, less likely to introduce security vulnerabilities, and can be combined with other opcodes to cover more use cases, at the cost of increased script complexity. CTV and CSFS change how transactions are processed: CTV enables delayed outputs, and CSFS enables delayed signatures. MATT stands out for its optimistic execution and fraud proof strategy, using the Merkle tree structure to implement general smart contracts, although it still requires new opcodes for introspection.
We see that the Bitcoin community is actively discussing the possibility of obtaining covenants through soft forks. Starknet has officially announced that it is joining the Bitcoin ecosystem and plans to implement settlement on the Bitcoin network within six months of OP_CAT being merged. Chakra will continue to follow the latest developments in the Bitcoin ecosystem, promote the merging of the OP_CAT soft fork, and use the programmability brought by covenants to build a safer and more efficient Bitcoin settlement layer.