Hypha/Informal Oversight Kickoff Meeting

Introduction

As part of Proposal #839 (Fund 2024 Hub development by Informal Systems and Hypha Worker Co-op), Informal and Hypha will be meeting with their 2024 Oversight Committee each quarter to go over plans and reflections for the quarter.

The 2024 Oversight Committee consists of:

  • Stride contributor: Aidan Salzmann
  • Neutron contributor: Avril Dutheil
  • Polkachu representative: Polkachu
  • Jim Parillo at Figment Capital
  • Shane Vitarana at Stargaze
  • James Hinck, Product Manager at Circle

This is a summary of the Hypha/Informal 2024 Oversight committee meeting which took place on January 8th 2024. The oversight committee asked many questions about the plan, which we have paraphrased here. All members of the oversight committee have reviewed and approved this document prior to posting.

The document is organized into 3 sections: Informal’s Q1 plan, Hypha’s Q1 plan, and a summarized transcript of the discussion that took place. We’re still figuring out the format for reporting on this, so let us know how you like it.

Informal Q1 2024 plan

Maintenance

Gaia v15 with SDK 0.47

During Q1 we will release Gaia v15 featuring SDK 0.47, which brings many enhancements and improvements to the Cosmos SDK blockchain platform that the Cosmos Hub is built on.

Maintain a regular release cycle (v16)

We work towards two releases per quarter, since regular releases are a healthy software engineering practice. Gaia v15 is updating a core dependency (Cosmos SDK), and v16 will enable new features on top of that dependency.

Skip SDK and fee market

Skip has created a fee market which will dynamically adjust fees depending on blockspace usage, similar to what exists on Ethereum and other major blockchain platforms. This will allow for low fees under normal usage, raise fees to price in congestion during busy times, and make DoS attacks non-viable. We will work to integrate this into the Cosmos Hub.
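
To make the idea of usage-based fee adjustment concrete, here is a minimal sketch of the Ethereum-style (EIP-1559) update rule. It is not Skip's implementation: the function name, the 1/8 maximum change per block, and the minimum fee are assumptions chosen for illustration.

```python
from fractions import Fraction


def next_base_fee(base_fee, gas_used, gas_target,
                  max_change=Fraction(1, 8), min_fee=Fraction(1, 1000)):
    """Return the base fee for the next block (EIP-1559-style rule).

    All parameter names and default values are illustrative assumptions.
    The fee rises when a block uses more gas than the target and falls
    when it uses less, by at most `max_change` per block.
    """
    if gas_target <= 0:
        raise ValueError("gas_target must be positive")
    # Relative distance from the target: -1 for an empty block,
    # 0 at the target, +1 for a block at twice the target.
    delta = Fraction(gas_used - gas_target, gas_target)
    new_fee = base_fee * (1 + max_change * delta)
    # Never drop below the floor, so fees stay non-zero when the chain is idle.
    return max(new_fee, min_fee)
```

Under these assumptions, a run of ten consecutive blocks at twice the target multiplies the fee by (9/8)^10, a bit more than 3x, which is the mechanism that makes sustained spam increasingly expensive.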

ICA controller

IBC 7 (enabled by the upgrade to SDK 0.47) enables the Interchain Accounts controller module, which will let Cosmos Hub governance and other modules control accounts on other chains. This will be useful for Neutron. We will work to get this integrated into Gaia for v16.

Future SDK prep

We are upgrading ICS to SDK 0.50, and will work on the LSM as well in preparation for a future release of the Cosmos Hub using SDK 0.50.

Issue resolution

We receive many GitHub issues and will endeavor to respond to all of them, either by fixing legitimate code issues, offering guidance or direction to documentation, or closing issues as wontfix.

Testing

Expand MBT functionality

Model Based Testing is a testing methodology that uses formal models to generate a huge number of tests. Due to our work on CometMock, we are now able to use this methodology to test the ICS codebase. Currently our model does not cover several areas of functionality including key assignment and reward distribution. We will add this coverage, which will help us develop features such as epochs and partial set security.
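
For readers unfamiliar with the methodology, the following toy sketch shows the basic loop of model based testing: enumerate action sequences, run each one against both an abstract model and the implementation, and compare states after every step. Everything here (the actions, `model_step`, `ToyStaking`) is hypothetical and far simpler than the ICS model or the tooling actually used.

```python
from itertools import product

# Hypothetical actions a test trace can contain: (kind, amount).
ACTIONS = [("delegate", 5), ("undelegate", 3), ("end_block", 0)]


def model_step(state, action):
    """Abstract model. State is (staked, update_pending)."""
    staked, pending = state
    kind, amount = action
    if kind == "delegate":
        return (staked + amount, True)
    if kind == "undelegate":
        if amount > staked:
            return state  # rejected: nothing changes
        return (staked - amount, True)
    return (staked, False)  # end_block flushes the pending update


class ToyStaking:
    """Stand-in for the implementation under test."""

    def __init__(self):
        self.staked = 0
        self.pending = False

    def apply(self, action):
        kind, amount = action
        if kind == "delegate":
            self.staked += amount
            self.pending = True
        elif kind == "undelegate":
            if amount <= self.staked:
                self.staked -= amount
                self.pending = True
        else:
            self.pending = False

    def snapshot(self):
        return (self.staked, self.pending)


def run_model_based_tests(depth):
    """Check every trace of length `depth`; return how many were checked."""
    checked = 0
    for trace in product(ACTIONS, repeat=depth):
        state = (0, False)
        impl = ToyStaking()
        for action in trace:
            state = model_step(state, action)
            impl.apply(action)
            assert impl.snapshot() == state, f"divergence on trace {trace}"
        checked += 1
    return checked
```

With three actions, a depth of 4 already yields 81 traces, which illustrates how a small model can generate a large number of tests.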

Restructure e2e tests

Our current end to end tests provide great coverage, but are somewhat inflexible and built around a core bash script. We will restructure them to improve:

  • Testing multiple different versions of ICS against each other
  • Testing multiple consumer chains at once
  • Producing better output

Research and Development

Acknowledgements: Thanks to Quoc Le of Stakefish for extensive input on the Partial Set Security design whose implementation will make up a large part of our R&D work for Q1.

Partial Set Security

Partial Set Security is a proposed modification to ICS that allows more flexibility in how much power consumer chains get from the validator set, gives validators more choice in which consumer chains to validate, and can allow consumer chains to be launched permissionlessly. We are following the CHIPs process in doing this work.
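
The Top-N and opt-in variants mentioned later in this post (under expected challenges and in the committee discussion) can be pictured with a small sketch. It assumes that Top-N means the largest validators, taken in order until they hold N% of total voting power, are required to validate, and that anyone else may opt in. That selection rule and both function names are assumptions for illustration; the implementation plan is the authority on the real rule.

```python
def top_n_required(powers, n_percent):
    """Return the validators required to validate under a Top-N rule.

    `powers` maps validator name to voting power. Validators are taken
    largest first until their combined power reaches `n_percent` of the
    total. This rule is an illustrative assumption.
    """
    if not 0 <= n_percent <= 100:
        raise ValueError("n_percent must be between 0 and 100")
    total = sum(powers.values())
    required = []
    accumulated = 0
    # Sort by descending power, breaking ties by name for determinism.
    for name, power in sorted(powers.items(), key=lambda kv: (-kv[1], kv[0])):
        if accumulated * 100 >= n_percent * total:
            break
        required.append(name)
        accumulated += power
    return required


def consumer_validator_set(powers, n_percent, opted_in):
    """Combine the required Top-N validators with those that opted in."""
    required = set(top_n_required(powers, n_percent))
    members = required | (set(opted_in) & set(powers))
    return {name: powers[name] for name in sorted(members)}
```

In this sketch a pure opt-in chain is simply the case `n_percent = 0`, where nobody is required and the set consists only of validators that chose to join.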

Specification/Spike phase

We are going to combine the CHIPs Specification and Spike phases into an implementation plan which will be posted on the forum and will guide our work. We are combining them because Partial Set Security is a modification to existing software and does not have as much need for a full prototype. The implementation plan will go into a high level of detail on exactly which code in ICS will be modified.

Signaling phase

After the implementation plan is up, we will create a signaling proposal to get community approval on this new direction for ICS. The proposal will focus on what the feature means for end users, consumer chains, and validators.

MVP implementation/testing phase

We will work towards implementing most core features of Partial Set Security during Q1. We do not anticipate completing it fully or getting it into production, but we should be able to demonstrate all of the major aspects of the feature in tests.

Atomic IBC

Atomic IBC is a plan to bring an atomic composability platform to the Cosmos Hub, allowing appchains on it to compose seamlessly while scaling in parallel. A plan for an initial MVP is currently in the discussion phase. The MVP is Megablocks, a technique to build atomic composability on top of Comet. We are not 100% sure that this is the right way to go (we could do something more akin to a shared sequencer), but it is a good way to get started exploring the design space of state machine composition.

Specification and Spike phase

We will work towards a prototype ABCI shim for megablocks, while also specifying the work we are doing.

Comparative technology report

Many different platforms in the blockchain space, such as Espresso, Solana, and Heterogeneous Paxos, have the same goal of enabling horizontal scaling along with atomic composability. We will study these technologies and write a report comparing them to our approach.

ICS Epochs

Epochs will allow ICS to use far fewer valset update packets, perhaps one per day or one per hour. This will reduce costs for consumer chains and improve scalability.
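
The saving comes from batching: instead of sending a packet for every block in which validator power changes, the provider sends the net change once per epoch. A minimal sketch of that idea follows; the class, its fields, and the use of block height to mark epoch boundaries are assumptions for illustration, not the ICS design.

```python
class EpochBatcher:
    """Accumulate validator power changes and emit one update per epoch."""

    def __init__(self, blocks_per_epoch):
        if blocks_per_epoch < 1:
            raise ValueError("blocks_per_epoch must be at least 1")
        self.blocks_per_epoch = blocks_per_epoch
        self.last_sent = {}    # powers the consumer chain last heard about
        self.current = {}      # latest powers on the provider
        self.packets_sent = 0

    def end_block(self, height, changes):
        """Record this block's changes; return a packet dict or None.

        `changes` maps validator name to its new power (0 means removed).
        A packet is only produced on an epoch boundary, and only if the
        net state differs from what was last sent.
        """
        self.current.update(changes)
        if height % self.blocks_per_epoch != 0:
            return None
        diff = {name: power for name, power in self.current.items()
                if self.last_sent.get(name, 0) != power}
        if not diff:
            return None
        self.last_sent.update(diff)
        self.packets_sent += 1
        return diff
```

Note that changes which cancel out within an epoch (a validator's power going up and back down) produce no packet at all, which is part of why fewer packets are needed.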

IBC routing

The goal of our work on IBC routing is that when the feature lands in IBC, the Cosmos Hub is seen as the default choice.

BD for PSS

We expect more activity aimed at slightly smaller projects than with Replicated Security (RS), because the stakes can be lower. We will also work towards a governance-based liquidity injection plan to induce projects to join (“ATOM wars”).

Marketing & Content

We will produce a regular cadence of updates and thought leadership on our blog, the Cosmos Hub forum, and Twitter to keep the community informed and engaged with our work.

Expected challenges

  • LSM upgrade to SDK 0.50
  • Possible interactions between Skip Block SDK and the globalfee module.
  • ICA controller- it doesn’t do much without a use case. Need to integrate it with whatever the intended use case is.
  • Expanding MBT functionality- large formal models can become hard to maintain.
  • Restructure end to end tests- it’s very easy to overengineer e2e frameworks in Cosmos
  • Partial set security
    • New features can always take longer than expected
    • PSS involves fraud votes which can be tricky (818). It’s possible that the community will object to this part of PSS, but we do not believe it is secure without them.
    • Top-N provides very good security but still has challenges such as the poor uptime of larger validators.
    • Pure Opt-in (not Top-N) does not have the same properties as RS (it cannot be said that the security is the exact same as code running directly on the Hub).
  • Atomic IBC- the software development part of this is very exploratory and we cannot guarantee any particular outcome other than learning.
  • IBC routing
    • It’s unclear how much benefit it brings, since it is a gas cost reduction for client updates which are already quite cheap. However, if people are using it, it should be going through the Hub.
    • The IBC Go team is not sure when they will upstream it. We are encouraging them to do it sooner rather than later, but we do not control their schedule.
  • Unexpected emergencies, issues, bugs, drama- there are always interesting things happening in the ATOM community and it is likely that something will come up at some point and take away some amount of time that we wanted to spend on the items above.

Community input

We strive to solicit and integrate community feedback on our work.

Partial Set Security design

In finalizing the design for Partial Set Security, we followed the CHIPs process and created a discussion post on the forum. We also posted several Twitter threads to bring awareness and solicit feedback.

We also worked with community members individually to refine the design, and consulted several existing and prospective consumer chains and validators on the design.

This feedback resulted in us simplifying the design quite a bit, and making it match the needs of consumer chains and validators better.

CHIPs submissions

In addition to the CHIPs discussion phase post on Partial Set Security, we also submitted discussion phase proposals for Minimum Commission as a Function of Voting Power, Optimizing ICS reward distribution with per-chain commission, and Atomic IBC Megablocks. While we were able to get some input on these ideas through the discussion phase proposals, we’ll need to look at ways to get more eyes on CHIPs proposals in the future to gather more input.

Comet performance questions

There have been concerns in the community that Comet’s performance may suffer under high load. In collaboration with Hypha, we’ve been analyzing Comet’s performance by stress testing it in an effort to reproduce and diagnose any issues which may be present. A full report will be forthcoming.


Hypha Q1 2024 plan

Headcount and capacity expectations for Q1

Currently, Hypha has the following headcount:

  • 1 FTE program manager
  • 1 FTE software engineer
  • 0.6 FTE devops engineer
  • 0.05 FTE technical advisor

By the end of Q1, we expect to have hired another full-time software engineer for our team. In the early months though, we are at the lower end of our headcount range and this is reflected in our goals for the quarter.

Testing

Upgrade testing

All new Gaia releases go through periodic testing that begins after the relevant upgrade handler is added to the main branch, and continues through release candidates up until the release is cut. Automated upgrade tests are set up to run two scenarios, both of which follow the software upgrade proposal process with the governance module and run a baseline set of tests, such as launching a set of consumer chains using different versions of Interchain Security:

  • Fresh genesis
    • The faster of the two scenarios
    • A genesis file is initialized with three validators and the chain starts at height 1
    • All new features are tested
  • Stateful genesis
    • A genesis file is regularly exported from a Cosmos Hub node and modified to provide a single validator with a majority voting power
    • The chain starts at a recent mainnet height with the relevant state in place
    • New features are tested wherever possible
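
As a worked example of the stateful scenario, the sketch below computes how much voting power a single test validator would need in order to control a chain exported from mainnet. It assumes the goal is to exceed the CometBFT commit threshold of more than 2/3 of total voting power; the post only says "majority", so the threshold is a parameter. The function name is hypothetical and this is not a description of Hypha's tooling.

```python
from fractions import Fraction


def min_power_for_control(existing_power, threshold=Fraction(2, 3)):
    """Smallest integer power p with p / (existing_power + p) > threshold.

    `existing_power` is the combined power of all validators already in
    the exported genesis. The 2/3 default is an assumption; pass
    Fraction(1, 2) for a simple majority.
    """
    if existing_power < 0:
        raise ValueError("existing_power must be non-negative")
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    # Rearranging p / (E + p) > t gives p > t * E / (1 - t).
    bound = threshold * existing_power / (1 - threshold)
    # floor(bound) + 1 is the smallest integer strictly above the bound.
    return int(bound) + 1
```

For instance, against 100 units of existing power the validator needs 201 units to exceed 2/3, since 200 would give exactly 2/3 and that is not enough to commit blocks alone.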

In addition, local testnets are deployed to run verification tests as circumstances require.

Gaia releases typically go through several rounds of upgrade testing as new code is pushed and release candidates are cut.

Feature and regression testing

Each new version requires setting up feature-specific tests on top of the upgrade tests described above. Below is a summary of planned features and dependencies to be tested in upcoming releases:

v15

  • Bump SDK to 0.47 and IBC to v7
    • Verify all tests continue to pass and use the existing workflows to highlight all, if any, breaking changes, especially for the globalfee, PFM, and liquid staking modules
  • ICS 3.3.0 compatibility
  • Provider module queries
  • Proposal 826 outcome - implement minimum commission migration (existing validators’ commission cannot be <5%)
  • Proposal 104 outcome - clawback funds from Notional

v16 (tbd)

  • Skip SDK and fee module
  • ICA controller module

Partial Set Security test plan

We’ll be working alongside Informal to develop a thorough test plan for Partial Set Security as it goes through the CHIPs process.

Testnet events and activity

We’ve kept up a regular cadence of Testnet Wednesdays so long as they don’t interfere with mainnet events. We’ll keep this up throughout Q1 with regular upgrade and launch events depending on consumer chain schedules.

We’re also looking to run at least one game day per month in which we run simulations on the RS Testnet and see how the network performs under stress. This sort of event can lead to tuning parameters on mainnet (such as during our investigation into p2p storms) and a better understanding of how mainnet might be affected by different kinds of real world events (e.g., a geographic region’s nodes all going down, a set of peers struggling, a subset of the network having their state corrupted or apphashing).

Regular testnet events are also a major component of the next item, the Testnet Incentive Program.

Testnet Incentive Program (TIP)

Hypha is stewarding a pilot program to incentivize testnet participation, funded by a 50k USD grant from AADAO. This grant pays validators $100-500 USD per month if they meet the following criteria:

  1. Be part of the active set on the Hub
  2. Validate all mainnet consumer chains
  3. Remain unjailed on the provider testnet chain for the full month
  4. Participate in all testnet events
  5. Run the same infrastructure on testnet and mainnet

These criteria all serve the goal of ensuring that the Hub has a production-grade testing environment where we can investigate issues under the same conditions as mainnet.

In Q1, we plan on:

  • Paying out the entirety of the pilot program grant
  • Identifying reliable validators and decentralizing the stake on the testnet (Hypha currently controls around 57% of the stake) to more closely mimic mainnet
  • Reporting on our findings and soliciting feedback for a longer-term program
  • Applying for an ongoing grant to fund incentives
  • Increasing the volume of activity on the testnets (e.g., more testnet events, game days, testnet users)

Gaia maintenance

Hub upgrades

Hypha is on-call for all Hub upgrades, planned or emergency. All Gaia releases go through our regular three-phase testnet process (local, release, replicated security) before hitting mainnet.

In Q1, we’ll go through this for v15 and v16, on target with having 2 releases per quarter. Details specific to these two upgrades can be found in the testing section.

Hypha also supports Hub upgrades on the communications side, helping write instructions and informational content for validators ahead of planned upgrades and then spread information ahead of the upgrade event.

Hub security support

We will do testing for Hub-related security issues triaged by Amulet. This involves closely collaborating with the Amulet and Informal teams to evaluate the severity and impact for a given issue and developing a plan to:

  • Verify the issue is a credible security threat
  • Replicate in a local testnet
  • Investigate mitigation and/or fix options
  • Test solutions

Once a plan is agreed upon, Hypha will deploy infrastructure to run the required local tests and distribute private reports to the relevant teams after all findings are collected.

We will continue to follow this process for new and current issues, such as the ongoing investigation into mempool issues.

Mempool testing

We began an investigation into mempool parameters after several waves of IBC transactions were sent to the Replicated Security testnet in the last quarter of 2023.

We have set up a test platform that allows us to evaluate network and node performance by adjusting different parameters, such as mempool and block sizes, in a network that can scale to tens of nodes. We are collecting data on how memory use increases under different conditions (e.g. varying transaction sizes) as the state grows.

We have been working with the CometBFT team to characterize and evaluate the performance of different database backends using CometBFT deployments.

Interchain security

Consumer chain launch support

In addition to Gaia upgrades and security, Hypha supports validators and consumer chain teams in troubleshooting during launches and upgrades both on testnet and mainnet. We are on-call during consumer chain events and provide technical troubleshooting help and coordinate between consumer chain teams and Hub validators using our extensive validator connections.

Technical support

  • Troubleshooting assistance during launches and upgrades using our testnet experience with niche issues like consensus errors, key assignment, etc.
  • Test consumer chain binaries through mock launches to find potential bugs or UX improvements.
  • Review consumer chain teams’ launch plans to identify issues such as misconfigured consumer addition proposals that could lead to launch failures.
  • Review of post-upgrade issues in testnets to avoid issues in mainnet, such as identifying large resource consumption due to upgrade migrations.

In 2023, we supported Neutron and Stride’s launches and were on-hand for upgrades. In 2024, we will be assisting Aether (currently in the testnet rehearsal process with a target date of January 17) and continue to be in close contact with prospective chains like EntryPoint and Noble.

Process documentation

Documentation for bringing consumer chain launches and upgrades to mainnet is currently scattered and hard to find. In Q1, Hypha plans on recording the technical and social process (e.g., deadlines for code freeze before coordinating validators, best practices for engaging with testnet, recommendations for formatting and communication) as a resource for incoming and existing teams.

Potential challenges

  • Testnet Incentive Program isn’t renewed by AADAO
    • Mitigation: Seeking alternate funding sources, such as community pool or paid testing events/programs
  • Lack of incoming consumer chains leads to few testnet events
    • Mitigation: Creating our own game days, partnering with other organizations and projects to run test events
  • Changes to features shortly before release reduce the testing timeline
    • Mitigation: Close communication with Informal dev team
    • Mitigation: Improving speed and efficiency of automated testing suite

Community input

  • Feedback comes up in our Discord channels (#replicated-security-testnet and #testnet-working-group) during testnet events
  • Hypha publishes qualitative reports on Testnet Wednesdays on the Hub forum.

Input received and incorporated

  • TIP: Adjustments to verifying mainnet validator status so that validators don’t use their mainnet key on testnet at all (even as a wallet)
  • TIP: Delaying Hypha’s validators coming online to better mimic mainnet conditions (one operator doesn’t control the majority of stake) and allow validators time to come online before the chain starts.
  • TIP: Need to find more flexible criteria for participation than signing within 5 blocks – we’re working on using signaling proposals as an attendance sheet for validators who show up and put in the work but don’t sign within 5 blocks (e.g., if there are major unforeseen complications with the event).
  • Setting a standard of upgrades being based on block height, not time. Testnet voting period is so short that it’s possible to set a block height that happens in a 10 minute window and tell validators to be online for it, but this prevents them from using tools like cosmovisor which are common on mainnet. Overwhelming feedback was to communicate a block-height ahead of time to allow for preparation and tools.
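
Communicating a height ahead of time means translating a desired wall-clock time into a block height. A minimal sketch of that estimate is shown below; the function names are hypothetical and the block times are inputs you would measure on the network in question.

```python
import math


def estimate_upgrade_height(current_height, avg_block_seconds, seconds_until_target):
    """Estimate the height the chain will reach at the target time."""
    if avg_block_seconds <= 0:
        raise ValueError("avg_block_seconds must be positive")
    if seconds_until_target < 0:
        raise ValueError("seconds_until_target must be non-negative")
    # Round up so the upgrade does not land before the announced time.
    blocks = math.ceil(seconds_until_target / avg_block_seconds)
    return current_height + blocks


def arrival_window(current_height, target_height,
                   fastest_block_seconds, slowest_block_seconds):
    """Return (earliest, latest) seconds until `target_height` is reached."""
    blocks = target_height - current_height
    if blocks < 0:
        raise ValueError("target_height is already in the past")
    return (blocks * fastest_block_seconds, blocks * slowest_block_seconds)
```

Because block times drift, the further ahead the height is announced, the wider the arrival window becomes. Validators who automate the switch at the announced height with tools like cosmovisor are not affected by that drift in the way a time-based schedule would be.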

Questions and comments from the oversight committee

  • Avril Dutheil: Is pure opt-in not considered as secure as top-n? And which of these options is going to be chosen?
    • Jehan Tremback: We wouldn’t say that pure opt-in isn’t as secure, per se. The difference between the two is that top-n, while it has fewer validators than replicated security, still has a significant proportion of the power of the whole Cosmos Hub. With a high top-n (let’s say over 66%), I think we can basically say the security of the consumer chain is the same as the security of the Hub. So I would be comfortable putting core Hub logic on such a consumer chain. Pure opt-in consumer chains could have just as much security but there is a difference where there’s no guarantee it will stay the same, and so putting core stuff on there is a bit of a different decision.
    • Jehan Tremback: We will be implementing both. Top-n can be added to a given consumer chain by governance proposal. We’ll also be migrating from replicated security to partial set security by changing existing replicated security chains to top-n chains with a very high top-n. They will then be equivalent to replicated security chains.
  • Shane Vitarana: Since a chain with a high top-n is almost identical to replicated security, will top-n chains be candidates for Atomic IBC?
    • Jehan Tremback: Chains using Atomic IBC are not exactly the same as ICS consumer chains. Atomic IBC is a platform for appchains and you will be able to launch appchains written using Cosmos-SDK on it, but under the hood, it’s kind of like all the chains using Atomic IBC are running on one big chain. They process their transactions in parallel so you have the horizontal scalability of multiple appchains, but they share blocks. So it’s not really like you just switch Atomic IBC on for a consumer chain, it’s more like they migrate to the platform. And this is kind of an implementation detail, but Atomic IBC may itself launch as a consumer chain of the Cosmos Hub.
  • Avril Dutheil: Since chains using Atomic IBC share blocks, don’t they lose some flexibility in their block times, consensus settings, etc.?
    • Jehan Tremback: Yes, they do. I would argue that they lose less flexibility than almost any other platform such as smart contract platforms, shared sequencers, or DA layers, but yes, they are less flexible than a standalone chain. And that is going to be one of the main challenges with Atomic IBC: do people developing appchains want to lose that flexibility to gain the atomic composability that Atomic IBC offers?
  • Avril Dutheil: Would Atomic IBC be attractive to rollup candidates?
    • Jehan Tremback: Yeah, it could be. They would be getting more flexibility than they do on a rollup platform like Celestia. Rollups share the same blocks too, but they also have a lot of restrictions on what type of state machine they can use. That’s not completely true because there’s stuff like sequencer confirmations vs DA layer confirmations, but I think it’s generally the case.
    • Jehan Tremback: But I think what’s really critical for Atomic IBC is going to be getting adoption from existing consumer chains, and also for us to expand the set of existing consumer chains using Partial Set Security. That’s where our edge is going to be, by continuing to build momentum. There are so many blockchain platforms these days, there are almost literally more platforms than apps. So it’s a very tough market for any new platform. So making sure that Atomic IBC works for our existing consumer chains so that we can retain momentum is going to be critical for its success.
  • Avril Dutheil: I don’t think that atomic composability will be that important for Stride and Neutron, because contracts on Neutron are already atomically composable with each other, and there isn’t really a big benefit to minting liquid staking tokens with the really tight synchronization that atomic composability provides. There would be a much bigger benefit with two smart contract platforms, let’s say Neutron and Osmosis, because then the platforms could share liquidity and that’s a plus for the ecosystem. So for combining multiple platforms that already have atomic composability inside of them, I think that’s where Atomic IBC is adding a lot of value.
    • Jehan Tremback: Yeah, and someone writing a smart contract isn’t necessarily into launching their own chain. So if they’re going to be launching on Atomic IBC, it makes sense for them to be launching on a platform that’s providing a smart contract layer for them.
    • Jehan Tremback: But yea, with blockchain platforms these days and how crowded it is, it’s all about having a head start. So that’s why Atomic IBC isn’t a big part of the roadmap yet this quarter and we are focusing on Partial Set Security. We need to build momentum with ICS to set us up for the next thing.
  • James Hinck: Do you see ICS and the Cosmos Hub as primarily designed for Cosmos chains, and serving the needs of the existing Cosmos community, or are they designed to expand the Cosmos community and position Cosmos and shared security against other platforms that are out there?
    • Jehan Tremback: This isn’t necessarily set in stone, and we could adjust our strategy, but right now we are focused on serving Cosmos chains. That’s because Cosmos as a platform has had a lot of success and there’s a big funnel of chains building on Cosmos SDK and IBC, and we think a good place for our platform to be is to serve applications that are being built as Cosmos chains.
    • Jehan Tremback: And you even see that with Atomic IBC. In some ways, it could be seen as being pretty similar to something like Solana, in that it’s a platform where most stuff runs in parallel but then stuff runs sequentially when needed. But instead of trying to make it feel like a monolithic smart contract platform like Solana does, it’s a platform where you can launch code that would otherwise launch as a Cosmos appchain.
  • Jim Parillo: One thing that Eigenlayer does that’s cool is that you don’t need to have a huge amount of security for a small chain. The security can grow as it needs to. So having that flexibility in ICS [using Partial Set Security] might do two things. Number one, large validators might decide it’s not worth their time and not bother with a smaller chain. But smaller validators might be more willing to take it on because their stake might be a much larger percentage on a small chain. Number two, it might actually help decentralize stake because delegators might start to stake with smaller validators to get a shot at every consumer chain.
    • Jehan Tremback: Yea, another thing that might happen is that there’s some random small consumer chain, and there’s only a few validators on it, and the token randomly pumps. Then everyone else delegated to other validators is gonna be like “wait a second, why’d I miss out on those rewards?” And it’s going to be like a FOMO moment, and you’re going to start getting validators bragging about running every consumer chain. And it’s going to kind of change the narrative, because right now with Replicated Security, at best validators are going to have no complaints and not say anything about it, and people hear the validators that do have complaints. Partial Set Security is going to incentivize validators to talk about how they run every consumer chain and it’s no big deal, so that will flip the narrative a little bit.
  • Avril Dutheil: I had a question for Hypha. Do you think it’s possible to have something similar to what you do on testnet in terms of optimizing validator engagement for upgrades?
    • Lexa Michaelides: Yes, my mind always goes to the communication pathway. Broadening our reach to mainnet validators and preparing good templates, consistent ways of communicating with people, etc. Our flow on testnet is quite strong because the information always passes through one team (Hypha) and we’ve gotten very practiced at it. But the knowledge we gain from coordinating multiple testnets gets passed over to mainnet as we work with Informal, produce content that gets used in mainnet announcements, and are on-hand for mainnet upgrades and events. We regularly participate in retrospectives for mainnet events and bring in our knowledge of what works on testnet and that often ends up being a valuable resource for mainnet events.

Thanks for posting @lexa and thanks to the oversight committee members for their commitment to the Cosmos Hub.

Will have a proper read over the weekend, but just want to drop a quick note to say:

  • thanks for keeping your “campaign” promises
  • making this report detailed
  • esp thanks for the q&a
  • and based on the q&a, thanks for oversight committee for not just “rubber stamping”

Confidence restored in Hub’s community funding props (I know you haven’t “delivered” the work yet, but I can see commitment to delivery and accountability).

Special props to @lexa who no doubt was the person who probably wrote 80% of this (grateful to the rest of the teams too of course)

Also, since there will be at least 3 more of these + other community funding recipients should write reports too, can the forum get a dedicated section for these? Not just filed under “Misc”?

Nice and detailed report

Are there any BD efforts to onboard Crescent as a PSS chain?

Firstly, we want to extend our appreciation for the transparency report provided by the oversight. Given the considerable effort invested in creating this detailed post, we couldn’t help but consider whether having both a summary (similar to this post) and a video recording (or just an audio podcast) for us to review the entire meeting might be more effective. Despite the high level of technicality involved, we are confident that the community would appreciate hearing directly from the teams.

Maybe to save time and to concentrate Hypha further on their job, someone from the review committee could organize a twitter hangout or whatever once per month to go over all the updates

I’ve had some new categories created:

  • Proposal updates (for this sort of post)
  • AADAO
  • AEZ with Neutron and Stride subcategories

Hopefully this tidies things up a bit!

As for recording, we want to be sensitive to the privacy of the committee members. Summarizing the Q&A is not a heavy workload as we need to save that info for guiding our work anyway.
