One of our goals at the Informal Systems Cosmos Hub team is to be open and responsive to feedback from the community on our work. For this reason, we aim to get all important governance proposals in front of the community on the governance forum well in advance of voting so that we can modify them with feedback received.
We put up the draft Replicated Security proposal on the forum in mid-December 2022. Since then we have received a lot of feedback. This post deals with one piece of feedback in particular, and our modifications to RS in response to this feedback.
Replicated Security ensures that consumer chains have exactly the same security and liveness as the Cosmos Hub itself. Part of this is that validators must be punished for consensus infractions on consumer chains. In the current RS implementation, the consumer chain does this by sending a “slash packet” to the provider chain to report any misbehavior by a validator. Slash packets arriving on the provider go into a queue known as the “slash throttle”, which ensures that validators representing more than a few percent of voting power cannot be slashed at once (more on that later). When a packet leaves this queue, it triggers a punishment of the validator that committed the infraction.
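To make the queueing idea concrete, here is a minimal sketch of a throttle of this kind. It is not the actual RS implementation: the class and field names, the 5% allowance, and the rule that the meter may dip below zero before processing stops are all assumptions made for illustration. Voting power is expressed in basis points (1 bps = 0.01%) to keep the arithmetic exact.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SlashPacket:
    validator: str
    infraction: str  # "double_sign" or "downtime"


@dataclass
class SlashThrottle:
    # Hypothetical model: validator name -> share of total voting power, in bps.
    voting_power_bps: dict
    # Assumed allowance per replenish period (500 bps = 5%); illustrative only.
    allowance_bps: int = 500
    queue: deque = field(default_factory=deque)
    meter: int = field(init=False)

    def __post_init__(self):
        # Start with a full meter.
        self.meter = self.allowance_bps

    def enqueue(self, packet):
        self.queue.append(packet)

    def replenish(self):
        # Called once per period: top the meter up, never above the allowance.
        self.meter = min(self.meter + self.allowance_bps, self.allowance_bps)

    def process(self):
        # Release packets in arrival order while the meter is positive.
        # Each released packet costs the validator's voting power.
        released = []
        while self.queue and self.meter > 0:
            packet = self.queue.popleft()
            self.meter -= self.voting_power_bps[packet.validator]
            released.append(packet)
        return released
```

With three validators holding 3% each and a 5% allowance, the first call to `process()` releases two packets and leaves the meter negative. The third packet waits until `replenish()` has been called.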
Just as on a normal Cosmos chain, if the infraction is double signing, the punishment is tombstoning, a permanent removal of the validator from the validator set, and a loss of 5% of the validator’s funds. This loss of funds is known as slashing. If the infraction is downtime, the punishment is removal from the validator set for 10 minutes and slashing of only 0.01% of the validator’s funds.
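As a rough worked example of these numbers, the sketch below computes the slashed amount for each infraction. The percentages and the 10 minute removal come from the paragraph above. The table layout, the function name, and the 1,000,000 token stake are hypothetical.

```python
from decimal import Decimal

# Punishments as described above. jail_minutes is None for double signing
# because tombstoning is permanent.
PUNISHMENTS = {
    "double_sign": {"slash_fraction": Decimal("0.05"), "tombstone": True, "jail_minutes": None},
    "downtime": {"slash_fraction": Decimal("0.0001"), "tombstone": False, "jail_minutes": 10},
}


def slashed_amount(stake, infraction):
    """Return how much of `stake` is lost for the given infraction."""
    return stake * PUNISHMENTS[infraction]["slash_fraction"]


stake = Decimal("1000000")  # hypothetical validator funds, in tokens
double_sign_loss = slashed_amount(stake, "double_sign")  # 5% of 1,000,000
downtime_loss = slashed_amount(stake, "downtime")        # 0.01% of 1,000,000
```

On these figures, a double sign costs 50,000 tokens and downtime costs 100 tokens, a factor of 500 apart.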
These punishments occur on the Cosmos Hub, so misbehavior on a consumer chain impacts all nodes (Hub and consumer chain) run by that validator.
We received feedback from some validators and community members that it would be too risky to slash based solely on information transmitted from the consumer chain. The concern is that some malicious code on a consumer chain could send fake slash packets and slash a validator that had not committed any infractions.
This is a concern that we have as well, and we’ve built mitigations into the design already. The slash throttle mentioned earlier prevents too many validators from being slashed at once, a scenario which could harm the Cosmos Hub. We also recommend that all consumer chains be fully audited before being approved by governance. An audit guards against ordinary vulnerabilities that could cost users their funds, as well as against malicious code crafted to send fraudulent slash packets.
We also have updates to Replicated Security in the works, known as the “untrusted consumer chain protocol”, which will allow the Cosmos Hub to verify double signing and downtime evidence on its own. However, this is not trivial and will require more work.
In response to these concerns, we have decided to curtail the abilities of consumer chain code to slash validators on the Hub, at least in the first release of Replicated Security. This is temporary, until we release the untrusted consumer chain protocol.
Instead, instances of double signing on consumer chains will be logged. We are collaborating with the team at Ignite on a new type of governance proposal which can be used to slash and tombstone validators who equivocate on consumer chains. Rather than being slashed and tombstoned immediately, a validator who double signs on a consumer chain will be punished only after a governance vote. This modification adds an extra layer of safety while still punishing validators who violate the rules of consensus. Double signing is extremely rare in practice, so this path should seldom be needed.
Validators will, however, still be jailed for downtime. We have determined that jailing is essential to provide liveness guarantees for consumer chains, and that the governance process is too slow to provide the same guarantees if every single instance of downtime must pass through a two-week voting process.
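The split between the two infraction types can be sketched as a small routing function on the provider side. This is only an illustration of the policy described above, not code from the RS module: `Provider`, `handle_slash_packet`, and the returned strings are invented names, and the throttle and the governance proposal itself are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    # Hypothetical provider-side state, for illustration only.
    jailed: set = field(default_factory=set)
    equivocation_log: list = field(default_factory=list)

    def handle_slash_packet(self, consumer_chain, validator, infraction):
        if infraction == "double_sign":
            # No automatic slash or tombstone: record the report so that a
            # governance proposal can act on it later.
            self.equivocation_log.append((consumer_chain, validator))
            return "logged_for_governance"
        if infraction == "downtime":
            # Downtime is still handled automatically by jailing.
            self.jailed.add(validator)
            return "jailed"
        raise ValueError(f"unknown infraction: {infraction}")
```

In this sketch a double-sign report never changes the validator set on its own. It only leaves a record for governance to act on.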
In the scenario where malicious code on a consumer chain tries to jail every single Cosmos Hub validator at once, the throttling code mentioned above will take several days to jail them all. During this time, every validator who is jailed will unjail themselves quickly (most already have scripts to do this automatically), so the attack never takes more than a few percent of voting power out of the set at once. Most likely it won’t get very far at all, since the offending consumer chain can be removed with an emergency upgrade.
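A back-of-the-envelope simulation shows why the throttle blunts this attack. The parameters are illustrative assumptions, chosen only to be consistent with the “several days” and “few percent” figures above: the throttle lets through 1% of voting power (100 bps) per hour, and jailed validators are back in the set a fixed number of periods later. The model also treats voting power as finely divisible, which real validator sets are not.

```python
def simulate_jail_attack(allowance_bps, period_hours, unjail_after_periods):
    """Return (hours needed to jail everyone once, peak jailed power in bps).

    Simplified model: each period the attacker jails as much power as the
    throttle allows, and validators jailed `unjail_after_periods` or more
    periods ago have already unjailed themselves.
    """
    remaining = 10_000  # voting power (bps) not yet targeted by the attacker
    jailed_per_period = []
    peak = 0
    while remaining > 0:
        hit = min(allowance_bps, remaining)
        remaining -= hit
        jailed_per_period.append(hit)
        # Only the most recent periods' victims are still jailed.
        currently_jailed = sum(jailed_per_period[-unjail_after_periods:])
        peak = max(peak, currently_jailed)
    return len(jailed_per_period) * period_hours, peak


hours, peak_bps = simulate_jail_attack(allowance_bps=100, period_hours=1, unjail_after_periods=1)
```

Under these assumptions the attack needs 100 hours, a little over four days, to touch every validator, and no more than 1% of voting power is jailed at any moment. If validators took three periods to unjail, the peak would rise to 3%, still a small fraction of the set.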