Previous draft discussion post:
Partial Set Security is a reimagining of Opt-in Security (thanks to @effortcapital for the idea), which would allow only a subset of Hub validators to run a physical node for each consumer chain, while still allowing each consumer chain to be secured by the full stake of the Hub. This would work by allowing validators who were not running a physical node to delegate to a validator running a physical node. The implementation of this feature should be straightforward (while still being a lot of work), but before it is started, several questions need to be answered.
We’re posting this to get all of the questions we have identified into one place where they can be discussed, as well as soliciting the community to ask any questions that we may have missed.
The design of Partial Set Security revolves around the idea that if the entire stake of a provider chain does not secure a consumer chain, it may not be secure against attacks that validators cannot be slashed for. Such attacks include incorrect execution (in the absence of fraud or validity proofs) and liveness attacks. They are stopped in the single chain case by token toxicity: the idea that validators do not carry out these attacks because the chain’s staking token would crash in price if they did. Token toxicity should also hold for shared security techniques where the entire stake of the provider secures the consumer, such as Replicated Security. I’ve written about token toxicity more here.
Partial Set Security is intended to preserve token toxicity by allowing the entire stake of the provider to secure each consumer, even if every single validator does not run a physical node. But token toxicity is not completely proven. It is a good explanation for the continuing operation and security of most blockchains, but it is very hard to quantify. For example: does token toxicity hold if 99% of the provider’s stake is staked on a consumer? It very likely does. Does token toxicity hold if 0.1% of the provider’s stake is staked on the consumer? It very likely does not. But where is the line?
We’d love to get some more thoughts about token toxicity, and whether it is a good framework for shared security questions.
Are there any issues with validator delegation that are not present with normal delegation? It’s not even really clear that normal delegation is a good idea, but it seems to be working pretty well so far. Could validator delegation have unforeseen interactions?
Partial Set Security addresses the main criticism of Replicated Security: the high cost of making every validator on the Hub run every consumer chain. But how much of the cost of validation is node operation and how much is the risk of slashing? This has not been quantified. Will validators balk at being forced to delegate to another validator? Can proportional slashing (a protocol where accidental double signs which only affect a single validator incur far lower slashes) mitigate these concerns?
If a validator delegates instead of running a physical node, its delegators must be slashed in the case that the validator it has delegated to commits an offense and is slashed. But should the delegating validator itself be tombstoned? How is downtime handled?
It is obvious that a validator which delegates its power should get a lower commission than a validator which runs a physical node. But there are several possible approaches.
The simplest approach is to make it so that validators which delegate to other validators receive no commission for that consumer chain, with the validator running the physical node receiving it all. The delegating validator’s reward is simply not being penalized for downtime.
However, the delegating validator does incur some risks. If they are tombstoned for offenses, this is an obvious risk. Even if they are not tombstoned but their delegators are slashed, it is a reputational risk. For this reason, it might be appropriate to allow for a split of the commission between the validator running the physical node and the delegating validators. But who sets the overall commission, and the split? And is this really necessary? After all, a validator who is concerned about these risks can just run a physical node.
Maybe it is necessary to allow validators to set a different commission per consumer chain. After all, some chains may be more expensive to run than others.
This is probably the meatiest question about this design: how to ensure that a minimum amount of validation power (probably ⅓ is good) runs physical nodes? In the Partial Set Security paper, we lay out 4 alternatives. The first two (stopping consumer chains immediately when they go below this threshold, and forcing the last validators running a consumer chain to continue running it to avoid going below the threshold) can probably be dismissed out of hand. This leaves two viable options:
In this option, the top ⅓ of validators would be required to run all consumer chains. This would ensure that all consumer chains had the minimum amount of physical nodes needed for safety. Validators outside of the top ⅓ could still run physical nodes, stopping and starting them as consumer chains became more and less profitable. This is great because it is extremely simple, but it also has downsides. It could act as a centralization vector, as validators might prefer to delegate to the top ⅓ to avoid having to change their delegations because they would know that the top ⅓ would always run physical nodes. Conversely, it could also act as a Sybil incentive, since large validators might break up their stake to stay out of the top ⅓ and benefit from the optionality of being able to stop and start consumer chains. It might also lead to strange performance characteristics as the number of physical nodes in a consumer chain’s set might fluctuate with the chain’s token price and profitability.
In this variant, validators would need to commit to running a physical node for a consumer chain for a certain length of time, maybe 6 months. This would be more complicated, but might have better incentives for validators. We would need to figure out how the timing and mechanism of this would work. I can think of at least 3 different ways:
- Every 6 months, validators decide which consumer chains they will run physical nodes for over the next 6 months. All consumer chains share the same cycle.
- For a given consumer chain, every 6 months all validators decide whether they want to run physical nodes for the next 6 months. Consumer chains do not share the same cycle.
- Validators can start running on a consumer chain any time, but once they start, they must continue for 6 months.
After lots of discussion with validators, consumer chains, and community members (most of it occurring outside of this thread), we’ve made some big updates and simplifications to how partial set security will work. The biggest change is that we have gotten rid of validator delegation. Validators were just not willing to do it. New discussion draft is below:
The next major update of Interchain Security will introduce Partial Set Security. Partial Set Security increases the flexibility available to consumer chains, allowing them to dial in the right mix of economics, validator set agility, and security. Additionally, it will let validators choose whether or not they want to validate on any given consumer chain in most cases.
These changes will allow consumer chains to launch much more quickly and reduce the workload on Hub validators while allowing consumer chains to choose how much security they need.
Currently, consumer chains are created with a governance proposal, and then get the entire security of the Hub’s validator set. This is a simple system, with guaranteed high security, but it lacks flexibility. Running more chains puts more work on validators, without necessarily increasing their rewards by much. This is not a problem for large validators who earn millions in commission, but it is a strain on smaller validators.
It also puts a lot of pressure on consumer chains to generate enough in rewards to pay the validators for their work. It does not allow consumer chains to select the level of security that they need, or to scale security as they grow their TVL.
Finally, the need to get a governance proposal to pass to create a consumer chain limits the rate of growth of ICS, due to the friction involved.
Partial Set Security could enable consumer chains to be launched permissionlessly, without a governance proposal. A consumer chain can be added with a simple transaction, as long as the chain ID is not currently in use (details on anti-squatting measures to follow).
Once a consumer chain has been created in this way, validators can opt in to validate it if they want to. We expect many consumer chains will be launched this way, and will be able to grow their validator sets organically.
Validators (and their delegators) are only entitled to rewards from a consumer chain if they are opted in. Some validators may only opt in to consumer chains with attractive rewards, and some may still opt into every consumer chain so that their delegators never have to worry about missing out.
It may be advantageous to use the existing governance interface to launch opt-in consumer chains. The idea here is that only validators who voted YES would be opted in to run a consumer chain. Validators could vote ABSTAIN to signal that they don’t want to run a consumer chain themselves, but don’t want to stop other validators from running it. Governance votes for opt-in consumer chains would use a much lower quorum threshold, so it would not be hard for such a vote to pass. The idea here would not be to regulate which chains could join (although the community might show up to vote NO on outright scams), but just to use the existing governance interface on the Cosmos Hub that validators are familiar with.
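As a rough illustration of the governance-based opt-in idea, the sketch below tallies a proposal and opts in only the validators who voted YES. It is a minimal sketch under stated assumptions: the function name `tally_opt_in_proposal`, the 10% default quorum, the simple "YES power must exceed NO power" passing rule, and the choice to count only validator votes are all hypothetical and are not taken from the Hub's actual governance module.

```python
def tally_opt_in_proposal(
    votes: dict[str, str],
    powers: dict[str, int],
    quorum_percent: int = 10,  # hypothetical "much lower" quorum
) -> tuple[bool, list[str]]:
    """Return (passed, opted_in) for an opt-in consumer chain proposal.

    votes maps validator name -> "YES", "NO", or "ABSTAIN".
    powers maps validator name -> voting power on the Hub.
    """
    total = sum(powers.values())
    voted = sum(powers[v] for v in votes)
    yes = sum(powers[v] for v, option in votes.items() if option == "YES")
    no = sum(powers[v] for v, option in votes.items() if option == "NO")

    # ABSTAIN counts toward quorum but neither for nor against.
    passed = voted * 100 >= quorum_percent * total and yes > no

    # Only YES voters are opted in, and only if the proposal passes.
    opted_in = sorted(v for v, o in votes.items() if o == "YES") if passed else []
    return passed, opted_in
```

For example, with powers `{"a": 50, "b": 30, "c": 20}` and votes `{"b": "YES", "c": "ABSTAIN"}`, the proposal passes and only `b` is opted in; `c` has signalled that it will not run the chain without blocking it.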
More writing on this option here.
While most consumer chains will likely launch with opt-in, some high profile consumers may want a guaranteed level of security. This is what top-n provides.
The top-n for a consumer chain specifies what percentage of the Hub’s security that consumer chain would like to guarantee. When the consumer chain starts, the top n percent of the Hub’s validator set will be obligated to run the consumer chain.
Even though validators outside of that top-n percent are not obligated to run the consumer chain, they can still choose to run it if they want. Many probably still will so that they don’t miss out on rewards, and as a selling point to consumers.
Let’s look at a few scenarios:
- A top-n of 100% is equivalent to Replicated Security.
- With a top-n of 65%, a consumer chain gets more than half of the Hub’s economic security, but with only 23 validators.
- A top-n below 33% is not possible. This is important for incentive reasons, so that the top validators can never be forced to run a consumer chain they don’t want (with 33% they could veto the proposal).
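The scenarios above can be sketched in code. This is a minimal sketch, assuming that "top n percent" means the smallest group of the largest validators whose combined voting power reaches n percent of the Hub's total; the function name `top_n_set` and the tie-breaking rule are illustrative choices, not part of the design.

```python
def top_n_set(powers: dict[str, int], top_n: int) -> list[str]:
    """Return the validators obligated to run a consumer chain with this top-n.

    Assumption: "top n percent" is read as the smallest group of the largest
    validators whose combined power reaches n percent of the total.
    """
    if not 33 <= top_n <= 100:
        raise ValueError("top-n must be between 33 and 100")

    total = sum(powers.values())
    # Largest validators first; ties broken by name so the result is deterministic.
    ranked = sorted(powers.items(), key=lambda kv: (-kv[1], kv[0]))

    obligated: list[str] = []
    running = 0
    for name, power in ranked:
        # Stop once the validators collected so far cover n percent of the total.
        if running * 100 >= top_n * total:
            break
        obligated.append(name)
        running += power
    return obligated
```

With four validators holding 40, 30, 20, and 10 units of power, a top-n of 65 obligates only the two largest (70% of the power), while a top-n of 100 obligates all four, matching Replicated Security.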
On an opt-in consumer chain, when a validator is jailed, they are simply opted out of that consumer chain automatically, and cannot opt back in for a time period. There is no jailing on the Hub.
If a consumer is using top-n, then the validators in their top-n set can still be jailed on the Hub for downtime. This mechanism provides top-n’s guaranteed level of security.
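The two downtime paths might look roughly like the sketch below. All names here (`Consumer`, `handle_downtime`, `can_opt_in`, `OPT_IN_COOLDOWN_BLOCKS`) are hypothetical, the cooldown length is a made-up value since the text only says "a time period", and treating validators outside the top-n set on a top-n chain like opt-in validators is an assumption.

```python
from dataclasses import dataclass, field

# Hypothetical value; the design only says "a time period".
OPT_IN_COOLDOWN_BLOCKS = 10_000


@dataclass
class Consumer:
    top_n_set: set[str] = field(default_factory=set)  # empty on a pure opt-in chain
    opted_in: set[str] = field(default_factory=set)
    cooldown_until: dict[str, int] = field(default_factory=dict)


def handle_downtime(chain: Consumer, validator: str, height: int) -> str:
    """Decide what happens when a validator is down on a consumer chain."""
    if validator in chain.top_n_set:
        # Obligated validators can still be jailed on the Hub.
        return "jail_on_hub"
    # Everyone else is opted out and must wait before opting back in.
    chain.opted_in.discard(validator)
    chain.cooldown_until[validator] = height + OPT_IN_COOLDOWN_BLOCKS
    return "opted_out"


def can_opt_in(chain: Consumer, validator: str, height: int) -> bool:
    """A validator may opt in again once its cooldown has expired."""
    return height >= chain.cooldown_until.get(validator, 0)
```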
Each consumer chain can also set a cap on the power that any individual validator can have on their chain. This has several purposes.
- It can prevent a large validator entering a small consumer chain from immediately controlling the chain.
- It can prevent downtime from a few large validators on a consumer halting the chain.
It’s best not to use this too heavily to avoid distorting the PoS system, but a reasonable cap can have a beneficial effect on the chain’s Nakamoto coefficient.
We’ll release some analysis on different top-n and cap scenarios soon.
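In the meantime, here is a small sketch of how a cap can change a chain's Nakamoto coefficient. It rests on two assumptions that the text does not specify: power above the cap is simply ignored on the consumer chain rather than redistributed, and the Nakamoto coefficient is taken to be the smallest number of validators holding more than ⅓ of the power (enough to halt the chain). The function names are illustrative.

```python
def apply_cap(powers: dict[str, int], cap_percent: int) -> dict[str, int]:
    """Clip each validator's power at cap_percent of the uncapped total.

    Assumption: excess power is dropped, not redistributed.
    """
    ceiling = sum(powers.values()) * cap_percent // 100
    return {name: min(power, ceiling) for name, power in powers.items()}


def nakamoto_coefficient(powers: dict[str, int]) -> int:
    """Smallest number of validators that together hold more than 1/3 of power."""
    total = sum(powers.values())
    running = 0
    ranked = sorted(powers.values(), reverse=True)
    for count, power in enumerate(ranked, start=1):
        running += power
        if running * 3 > total:
            return count
    return 0  # only reached when there is no power at all


powers = {"a": 50, "b": 20, "c": 15, "d": 10, "e": 5}
capped = apply_cap(powers, 20)
print(nakamoto_coefficient(powers), nakamoto_coefficient(capped))
```

In this example a single validator can halt the uncapped chain, while two are needed after a 20% cap. Note that with this simple clipping the largest validator still holds 20 of 70 units on the capped chain, which is more than 20% of the capped total; a real implementation would have to decide how to treat that.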
Shared security systems where a partial stake of the provider chain secures consumer chains can be vulnerable to a security challenge known as the subset problem. To put it simply, this is a situation where a malicious subset of the provider chain’s validator set controls a consumer chain and attacks it in a way that they cannot be slashed for. This is not an issue for systems where the whole stake of the provider secures consumer chains, such as Replicated Security. The subset problem is a potential issue for Partial Set Security.
Fraud votes are a way to mitigate the subset problem. A fraud vote is a way for Cosmos Hub governance to slash validators that misbehave on consumer chains. It is simply a governance proposal that will slash one or more validators if it passes. It is to be used strictly for slashing validators who commit incorrect execution on a consumer chain, for example taking all of the money out of someone’s wallet.
Fraud proofs and zk validity proofs solve this problem without a vote, so once these technologies are working in Cosmos, we will be able to remove fraud votes.
This is a powerful tool, but it will have some limits that prevent a proposal from even being created if certain conditions are not met:
- Validators cannot be slashed for an offense on a consumer chain they are not validating.
- Fraud votes do not apply to consumer chains with more than 1/3 of the Hub’s stake.
- Fraud votes do not apply to consumer chains using top-n, since these cannot have less than 1/3 of the Hub’s stake.
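These three limits could be checked at proposal creation time along the following lines. This is a minimal sketch: the `ConsumerChain` type, the `can_create_fraud_vote` function, and the representation of stake as integers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumerChain:
    chain_id: str
    uses_top_n: bool
    # Validator name -> Hub stake behind that validator on this consumer chain.
    validators: dict[str, int] = field(default_factory=dict)


def can_create_fraud_vote(
    chain: ConsumerChain, accused: list[str], hub_total_stake: int
) -> tuple[bool, str]:
    """Check the limits above before a fraud vote proposal is accepted."""
    if chain.uses_top_n:
        return False, "top-n chains always have at least 1/3 of the Hub's stake"
    if sum(chain.validators.values()) * 3 > hub_total_stake:
        return False, "chain is secured by more than 1/3 of the Hub's stake"
    not_validating = [v for v in accused if v not in chain.validators]
    if not_validating:
        return False, f"not validating this chain: {', '.join(not_validating)}"
    return True, "ok"
```

For a chain secured by 25 out of 100 units of Hub stake, a fraud vote against one of its validators can be created, but one naming a validator who is not on that chain is rejected.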