We’re putting this on the forum to start a discussion about whether it makes sense to create a new type of governance proposal on the Hub which would allow for votes to slash validators or delegators for attacking consumer chains with incorrect execution.
Several things are important to note:
- This new proposal type would only apply to validators who had opted into an Opt-in consumer chain, or delegators who had opted into a Mesh consumer chain.
- It is a temporary measure until fraud proof technology and surrounding systems are mature enough.
- This type of proposal is to be used only to slash validators for incorrect execution. It is not to be used to slash validators for downtime on a consumer chain (even if they maliciously cause it to halt), or any other offenses. If we implement this type of proposal, we should add text stating the above to every proposal of this type that is created.
- Even if the Cosmos Hub does not use this proposal type to enable Mesh Security, it is likely that other Mesh provider chains will.
Unlike Replicated Security, both Opt-in Security and Mesh Security allow a subset of the stakers on a chain to contribute to securing another chain. While the ability to secure other chains is a very powerful tool, it can also have some pitfalls. It must be possible to punish these stakers for attacking the chains they secure.
What stops attacks on blockchains?
Currently, proof of stake protocols punish validators for double signing. Double signing is dangerous for a chain when a ⅓ cartel of stakers signs two conflicting blocks at the same block height. As an example, this could be used in an attack where one victim is led to believe that they have received some money and another victim is led to believe that they have received the same money. Each history is valid on its own, but together they are invalid.
Defending against double signing is valuable, but it is not sufficient to secure a chain. The other form of attack that must be defended against is known as an incorrect execution attack. In this attack, a ⅓ cartel of stakers signs a block that simply breaks the rules of the chain. For example, a chain might have a rule that says “no tokens can be transferred from one account to another without a signature from the sending account”. In an incorrect execution attack, a cartel of stakers might just sign a block that transfers everyone’s tokens into their own wallets. People running full nodes would know that something was wrong, but light clients such as those used in IBC bridges would have no idea.
So what stops cartels of stakers from performing an incorrect execution attack today? There are many factors, but one of the most important is known as token toxicity. If a cartel of stakers on the Cosmos Hub were to perform this attack to empty all of the Hub’s IBC bridges, it would crash the value of Atom, since the Hub’s security would have been shown to be worthless.
This dynamic holds for Replicated Security. RS consumer chains are solely secured by their provider chain, in this example the Cosmos Hub. So from a token toxicity standpoint, a ⅓ cartel of Atom stakers compromising a consumer chain is exactly like this cartel compromising the Hub itself. Either way, such an attack would make the security of the Hub worthless, and this acts as an incentive for validators not to perform these attacks.
Why token toxicity doesn’t work in Opt-in and Mesh Security
With Opt-in and Mesh Security, it’s not so clear that token toxicity will keep consumer chains safe. This is because the responsibility for an attack could be much more diffuse. Let’s look at some examples.
Imagine an Opt-in Security consumer chain has a $20m TVL, and is secured by Cosmos Hub validators with $70m in stake. This is theoretically secure against double signing. The ⅓ cartel which could attack this consumer chain has $23.3m staked and slashable[1], so stealing the $20m doesn’t make sense.
It is, however, not secure against incorrect execution. If there is no way to slash for incorrect execution on a consumer chain, then we rely on token toxicity. But this $70m of stake is a small fraction of the total stake on the Cosmos Hub. It’s not clear that the malfeasance of this small fraction of validators would crash the Atom price. After all, the vast majority of Atom stakers in this scenario are honest, and the attack did not affect the Hub itself or any other consumer chains. Of course, it’s impossible to predict what price movements would happen, but the case for a complete loss of Atom’s value in this scenario is a lot weaker. I don’t think we can rely on token toxicity.
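To make the double-signing arithmetic in this example concrete, here is a rough sketch. The function names are made up for illustration, and the 5% figure comes from footnote [1]; it is a back-of-the-envelope model, not a description of any real module.

```python
from fractions import Fraction

def cartel_stake_at_risk(total_stake, cartel_fraction=Fraction(1, 3),
                         slash_fraction=Fraction(1)):
    """Return (cartel_stake, slashable_amount) for a cartel of the given size.

    slash_fraction is the share of the cartel's stake that is actually
    destroyed on a slash (1 = the whole stake, as the example assumes).
    """
    cartel_stake = total_stake * cartel_fraction
    return cartel_stake, cartel_stake * slash_fraction

def attack_is_profitable(tvl, slashable):
    """An attack pays off only if the loot exceeds what can be slashed."""
    return tvl > slashable

total_stake = 70_000_000   # stake opted in to the consumer chain
tvl = 20_000_000           # value that could be stolen

cartel, slashable_full = cartel_stake_at_risk(total_stake)
_, slashable_5pct = cartel_stake_at_risk(total_stake,
                                         slash_fraction=Fraction(5, 100))

print(round(float(cartel) / 1e6, 1))              # cartel stake in $m
print(attack_is_profitable(tvl, slashable_full))  # full slash: not worth it
print(attack_is_profitable(tvl, slashable_5pct))  # 5% slash: worth it
```

With a full slash the $20m theft costs the cartel more than it gains, but with the 5% setting mentioned in the footnote the same attack becomes profitable, and none of this arithmetic applies at all if incorrect execution cannot be slashed.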
Let’s look at a similar scenario with Mesh Security. I would argue that the responsibility is even more diffuse in this case. Imagine a Mesh Security chain with $20m in TVL whose total stake is $70m, with $50m coming from a variety of provider chains. An attacker with $23.3m could deploy this capital across the provider chains to gain control of a ⅓ cartel of validators on the consumer chain. They could then perform the same attack.
In this scenario, it’s very unlikely that anyone would even think about blaming the provider chains. Since there are several of them, the attacker may only control a very small portion of the stake on each.
Slashing for incorrect execution
Both of these examples involve stakers with power over a consumer chain committing incorrect execution and not being slashed for it. The chain’s security must therefore rely on token toxicity, which for Opt-in and Mesh Security probably does not work.
The way to solve this is to avoid relying on token toxicity. If the attacker in these examples could be slashed for incorrect execution, this would solve the problem. Slashing for double signing is relatively trivial. In principle, it simply requires looking for two signatures on different blocks at the same height from the same validator. Slashing for incorrect execution is less trivial, because it requires a concept of what the “correct” execution is, which depends on the state and the execution dynamics. This requires something called “fraud proofs”.
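To show why double signing is the easy case, here is a minimal sketch of the check described above. The `SignedVote` type and `find_double_signers` function are hypothetical, and the sketch ignores signature verification, consensus rounds, and vote types, all of which a real implementation would have to handle.

```python
from collections import defaultdict
from typing import NamedTuple

class SignedVote(NamedTuple):
    validator: str   # validator identifier (hypothetical format)
    height: int      # block height the signature applies to
    block_hash: str  # hash of the block that was signed

def find_double_signers(votes):
    """Return validators that signed two different blocks at one height."""
    seen = defaultdict(set)  # (validator, height) -> block hashes signed
    for vote in votes:
        seen[(vote.validator, vote.height)].add(vote.block_hash)
    return sorted({validator
                   for (validator, _height), hashes in seen.items()
                   if len(hashes) > 1})

votes = [
    SignedVote("val-a", 10, "X"),
    SignedVote("val-a", 10, "Y"),  # conflicting block at the same height
    SignedVote("val-b", 10, "X"),
    SignedVote("val-b", 11, "Y"),  # different height, so no conflict
]
print(find_double_signers(votes))
```

Note that nothing here needs to know the chain's state or rules. There is no equivalent shortcut for incorrect execution, because deciding whether a block is valid means re-executing it against the state.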
Fraud proofs
Fraud proofs are an ongoing area of research and primarily intended for use with roll-ups, which are similar to consumer chains. A fraud proof allows you to prove that a validator signed an incorrect state transition, without needing to run a full node for the chain involved. A provider chain could accept proof that an incorrect state transition was signed by a validator and slash the stakers involved. This would enable Opt-in and Mesh security to function as intended.
However, fraud proof implementations are perpetually six months from completion. It’s a very hard technical problem. Currently there is no fraud proof framework that will work for Cosmos chains using Opt-in or Mesh security.
There are also a lot of problems that need to be solved around the edges. For example, in a naive implementation, a validator who accidentally ran the wrong binary during an upgrade might be slashed for fraud. Technically, it could be proved that they had signed an incorrect state transition if they were accidentally running the wrong binary. There needs to be a framework to handle these scenarios safely.
Additionally, fraud proofs (and ZK validity proofs) currently require a system called a “DA layer”, which is essentially another blockchain where all transactions must be posted for the fraud or validity proof to even work. If this DA layer is compromised, it may become impossible to slash for incorrect execution. In some sense, security is provided by the DA layer. What security is provided by Opt-in or Mesh security in this scenario needs to be examined more closely.
All of these challenges can and will be solved, but as the Cosmos community, we need to ask ourselves if we want to wait for that.
Fraud votes
To be able to launch Opt-in and Mesh security while work continues on cutting-edge fraud proof research, we can turn to a simple mechanism: the fraud vote. This would be a type of governance proposal. If (and only if) a staker or validator had opted in to stake on an Opt-in or Mesh consumer chain, they would become eligible to be slashed by this governance proposal. If an attack involving incorrect execution happened, proof could be submitted in this proposal. Voters could then sync up full nodes for the chain in question and verify the incorrect execution for themselves. This is certainly not as elegant as a fully automatic and optimized fraud proving system, but it should have much the same effect.
This also allows us to capitalize on an advantage that Cosmos has over rivals. Ethereum-based shared security systems such as Eigenlayer are under development. Cosmos has the lead for now, but we need to be prepared for the much larger stake on Ethereum to enter the market for shared security. However, Ethereum lacks a governance system, and it would not be possible for it to use an analogous mechanism. Perhaps a contract like Eigenlayer could use some sort of multisig or council, but this is clearly very centralized. They need to rely on fraud proofs. The possibility of fraud votes on Cosmos allows us to innovate and get out ahead while fraud proofs are still under development.
The YOLO scenario (hard fork slashing)
It is of course possible to run Opt-in or Mesh Security without fraud proofs or fraud votes. In this case, if validators or delegators were to cause incorrect execution, the only option for slashing them would be for the provider chain to hard fork to a version of the state where the offenders were slashed. This would essentially be the same thing as this proposal does with governance, but done with a potentially frantic back-channel hard fork. It could work, but it seems much better for something like this to be done in a controlled and intentional manner with a fraud vote as proposed here.
Arguments against fraud votes
Efficiency/spam argument
The core of fraud proof technology is the ability for the fraud proving framework to prove fraud without needing to sync up all the data that a full node needs. Our fraud votes would not have this ability, thus voters would need to sync up full nodes. This would be somewhat expensive. An attack utilizing this shortcoming could be as follows:
- The attacker would spam the provider chain with fraud vote proposals.
- Voters on the provider chain would become tired of the expense of syncing up full nodes to verify them.
- The attacker would then commit an actual instance of incorrect execution on a consumer chain, and nobody would bother to vote on the fraud vote proposal.
To prevent this, it would be good to look into raising the deposit requirements for fraud votes and maybe widening the range of scenarios under which such a deposit is burned. Also, it seems unlikely that this would be a huge problem in real life. Even if the provider chain was flooded with spam fraud vote proposals, an actual instance of theft from a consumer chain would be a big event and would have victims. Human communication outside of the blockchain could lead voters to the real fraud vote proposal, which they could then verify and vote on.
This may break down in advanced scenarios. In a world with thousands or millions of consumer chains, or one where consumer chains are lightweight and ephemeral, voters might not care enough to vote on any fraud vote proposals. But for the use cases that Opt-in and Mesh Security are currently being built for, with hundreds of chains securing relatively large amounts of value, fraud votes should work just fine. Once we reach these more advanced scenarios in the future, real fraud proofs should be ready.
Contentiousness argument
Another potential argument against fraud votes is the fact that contentious scenarios may arise. The example I have used so far is trivial in that it is obvious to everyone that the validators committing incorrect execution have done it with the intent to steal. However, imagine the following scenario:
- An Opt-in or Mesh consumer chain experiences an event that some consider an attack on the protocol and others consider to be an example of the attack’s “victims” trading badly and trying to socialize their losses by calling it an attack. Things like this have happened in the past involving flash loans, hostile DAO takeovers, and even the Luna depeg. Things like this happen frequently in the traditional finance world as well.
- Validators on the consumer chain apply an emergency upgrade to stop the “attack”.
- In the aftermath, those who feel that it wasn’t actually an attack submit a fraud vote proposal to the Cosmos Hub. Technically, it can be argued that the validators applying the upgrade committed incorrect execution because once they were running the emergency upgrade, they were no longer following the protocol originally specified by the consumer chain when it launched or in its last governance upgrade.
- Now Cosmos Hub governance needs to answer this question.
Vitalik Buterin writes about wanting to avoid this type of scenario on Ethereum. It should be noted that in the case of real fraud proofs, the validators applying the emergency upgrade would be slashed automatically. This is one of the issues which makes real fraud proofs tricky. For example, Eigenlayer, a system similar to Interchain Security which is intended to be used with real fraud proofs, faces this issue. They have built in a backdoor via a multisig “comprised of prominent members of the Ethereum and EigenLayer community” as a temporary solution (section 3.4.2).
This highlights an important issue: it’s very hard to avoid emergency upgrades, hardforks, and contentious events in real life. Ethereum may try, and it’s a lofty goal, but it’s not always possible. We saw this at the beginning of Ethereum with The DAO hack and hardfork. Even today, many Ethereum projects, like Eigenlayer, include a backdoor controlled by “prominent community members”. Cosmos should lean into its advantages, one of which is a robust and frequently used governance system built into the code and culture at a low level. In the case of the fraud vote system proposed here, this will allow us to increase our lead in the shared security space.
[1] In reality, on most Cosmos chains, the actual double signing slashing percentage is set to 5%, which means that under this analysis the security is only 5% of the total number. This should probably be changed, but that’s outside the scope of this example.
Protocol design notes
In the Cosmos governance system, votes take a fixed amount of time and only have an effect after the vote is concluded. If a fraud vote proposal is submitted to the provider too late, then the stakers in question will be fully unbonded before the voting period ends and will be able to escape slashing. With standard settings of a 2-week voting period and a 3-week unbonding period, this means that in the worst case, where the stakers begin unbonding right after the attack, there is only one week during which a fraud vote can be submitted and successfully slash the stakers.
pu: provider unbonding period
pv: provider voting period
epu: effective provider unbonding period
epu = pu - pv
If we reduce the voting period for fraud votes to a shorter amount of time, perhaps one week, then it will provide a longer amount of time during which the fraud vote proposal can be created. It is probably not a good idea to reduce the voting period to less than one week given the seriousness of a fraud vote.
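As a quick check of the numbers, the window can be computed directly from the definitions above. This is only a sketch: `effective_unbonding_period` is a name I made up, and it assumes the worst case where the stakers begin unbonding immediately after the attack.

```python
from datetime import timedelta

def effective_unbonding_period(pu, pv):
    """epu = pu - pv, floored at zero.

    pu: provider unbonding period
    pv: provider voting period
    A result of zero means a fraud vote could never conclude in time.
    """
    epu = pu - pv
    return max(epu, timedelta(0))

# Standard settings: 3-week unbonding, 2-week voting.
standard = effective_unbonding_period(timedelta(weeks=3), timedelta(weeks=2))
# Shortened fraud vote: 3-week unbonding, 1-week voting.
shortened = effective_unbonding_period(timedelta(weeks=3), timedelta(weeks=1))

print(standard.days)   # days available to submit under standard settings
print(shortened.days)  # days available with a one-week fraud vote
```

Under these assumptions, cutting the fraud vote's voting period to one week doubles the submission window from 7 days to 14.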