Testnet Wednesday Reports

Stride Launch Rehearsal (July 5, 2023)

This Wednesday, Hypha hosted Stride’s second launch rehearsal on the Replicated Security Persistent Testnet immediately after a very smooth Neutron mainnet upgrade :tada:

The testnet program is stewarded by Hypha Worker Co-op and provides a testing environment for pre-launch consumer chains on the Replicated Security Persistent Testnet. This work supports the success of the Cosmos Hub mainnet by catching issues before they go live and letting dev teams and infrastructure providers alike practice with the new technology introduced by RS. Stride will be the second Hub consumer chain to go through launch rehearsals on the RS testnet and the first sovereign-to-consumer transition.

With Stride going live on the Cosmos Hub in two weeks, the stakes were high for this rehearsal. In the last month, Hypha and the Stride team have worked closely to iterate on feedback from our last rehearsal on June 7.

Naming convention clarity

For chains that start off as consumer chains (such as Neutron), a genesis file is provided ahead of time and then modified after spawn time to include the CCV state. This modification can only happen after spawn time, as the CCV state is generated by the provider chain at spawn time, which is set in the consumer-addition proposal.

Chains that transition from sovereign to consumer (such as Stride) already have a genesis file in-use and need to have the CCV state added to the binary in a new file. After some confusion in our first rehearsal, Stride and Hypha have decided to adopt the terminology:

  • genesis file is the file used to originally start the sovereign chain. It does not contain the CCV state.
  • ccv file is the file that is generated after spawn time and which contains only the CCV state. It is added to the binary and does not replace the existing genesis file.

Spawn and upgrade time buffer

In the time between spawn on the Hub and upgrade on the sovereign Stride chain, validators need to generate the CCV state and append it to the pre-existing genesis file associated with Stride, then restart their nodes with the post-transition binary. In our first rehearsal, we only had 2 minutes to do this!

This time, we had about 20 minutes between spawn time and upgrade time, which made it much easier for validators to bring their infrastructure online without scrambling or feeling rushed.

Having this buffer time (or more) is strongly recommended for Stride’s mainnet launch.

Outreach and communication

In the weeks leading up to this rehearsal, we publicized many announcements via email, Discord, Twitter, and Telegram to bring new validators into the rehearsal.

We currently have 48 active validators on the Replicated Security testnet, 16 of whom are also validators on Cosmos Hub and Stride mainnet. Having these double mainnet validators participate in rehearsal will make sure that some of the most invested stakeholders will be prepared and experienced when it comes time to do it on mainnet!

Because this is the first time a chain will transition from sovereign to consumer, the process is still new to everyone and lots of effort went into documenting the steps to improve validator awareness and smooth out the transition.

  • Detailed post outlining steps in the transition
  • Diagram of the changeover process created by Stride
  • Step-by-step sequence table shared in Discord announcements

Consumer key assignment

We also continued to provide clarity on how the consumer key assignment feature works!

Consumer key assignment transactions must happen before spawn time. For Stride on mainnet, that means the transaction has to occur before July 19, 2023.

If the consumer key is not assigned, validators must reuse the provider key until a relayer is established and a new consumer key assignment transaction can be sent via the relayer.

Get involved with the Cosmos Hub testnet program!

If you have questions, comments, or feedback, we want to hear from you! Connect with us on Discord in the #replicated-security-testnet channel or DM me on Telegram (@lexaMichaelides) or Discord (lexamichaelides).

Are you a validator or node operator who is not yet part of the Replicated Security testnet? Join us!


Conversation of here proposals need voting before approved.


We’re going to use this thread as a running log of Testnet Wednesday events!

July 12, 2023

We launched Duality’s second chain on the Replicated Security testnet on July 12. duality-testnet-1 will remain online as Duality’s persistent test chain. The launch event took about one hour from proposal passage until interchain secured.

Shortly after spawn time, we ran into a JSON unmarshalling issue that caused the Duality binary to panic. After a bit of digging, we discovered that the app_state.group value had been incorrectly defined as an array of objects, rather than one object. With the aid of a few helpful validators, we fixed the bug, published an updated final genesis file, reset the node state, and started seeing blocks.

Today’s launch also reminded us to always run genesis files through jq before generating checksums. Without standardizing the format of genesis files with jq, whitespaces and differently-ordered keys will lead to different checksums, even though the data is the same. This is especially important considering that some validators build their own final genesis files after spawn time.
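The effect described above can be reproduced with a small Python sketch. It is an analogue of the jq normalization step, not the jq command itself, so its checksums will not necessarily match ones computed from jq output; everyone comparing checksums needs to use the same tool. The function name and the sample documents are made up for illustration.

```python
import hashlib
import json


def canonical_sha256(genesis_text: str) -> str:
    """Return the SHA-256 of a JSON document after normalizing its layout."""
    data = json.loads(genesis_text)
    # Sorting keys and fixing the separators removes differences that come
    # only from key order and whitespace.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two renderings of the same data: different whitespace and key order.
a = '{"chain_id": "duality-testnet-1", "app_state": {"group": {}}}'
b = '{\n  "app_state": {"group": {}},\n  "chain_id": "duality-testnet-1"\n}'

raw_match = hashlib.sha256(a.encode()).hexdigest() == hashlib.sha256(b.encode()).hexdigest()
canonical_match = canonical_sha256(a) == canonical_sha256(b)
print(raw_match, canonical_match)
```

The raw checksums differ while the normalized ones agree, which is why the normalization step has to come before the checksum is published.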

We unfortunately did not manage to have enough time to upgrade pion-1 today, so we will be upgrading the Neutron persistent chain at a later date.

Thank you to all participating validators, and to the Duality team for their patience with having this rehearsal rescheduled a bunch of times. Keep an eye out for Duality’s on-chain proposal coming soon!

Validator shout outs :raised_hands:

  • KrEwEdk0 from CitadelOne for their help debugging the consumer chain genesis file
  • 20/20 from VirtualHive and KrEwEdk0 for sharing their address books, getting many other validators online quickly
  • James from LavenderFive for flagging the bug with the final genesis file early

July 26, 2023

We had THREE testnet upgrades in one day! This is the highest number of operations performed in one day so far.

  • Upgrade theta-testnet-001 to v11
  • Upgrade provider to v11
  • Upgrade pion-1 to v1.0.4

The Gaia upgrades on the Theta testnet and the provider chain went smoothly. Both were major version upgrades, and these were coordinated upgrades via governance proposal. In both cases, the chain came back online within a few minutes after hitting the upgrade height. pion-1’s block height also drifted forward a bit, enabling us to start and conclude all three events in under 90 minutes. Snappy!

Coincidentally, we observed an issue with Hypha’s validators around the same time as the first scheduled upgrade. On the Replicated Security testnet, the nodes that were running Stride ran out of memory. This led to Hypha’s Stride validators going offline for a bit, which led to a temporary loss of consensus on the provider chain. Thanks to alerting from some proactive validators, we were able to provision more memory and recover in time for the Gaia upgrade.

Validator shout outs :raised_hands:

  • Bosco from Silk Nodes for posting step-by-step instructions for the Gaia upgrade
  • Alan from Nodestake for promptly uploading post-upgrade snapshots for both the Theta testnet and the Replicated Security testnet

August 23, 2023

The last few weeks have been pretty quiet on the testnets! Here’s a quick roundup.

August 9, 2023:

  • No coordinated Testnet event, but all validators were asked to upgrade their Gaia version from the v11 release candidate to the official v11 release.

August 16, 2023:

  • No testnet event, but we upgraded mainnet to v11.

August 23, 2023:

  • Upgraded release testnet from v11 to v12-rc
    • Upgrade height at 13:41 UTC; blocks at 13:47 UTC
  • Upgraded provider chain from v11 to v12-rc
    • Upgrade height at 14:20 UTC; blocks at 14:25 UTC

Very boring and smooth upgrades all around! Thanks everyone for your help in making this another successful Testnet Wednesday :ribbon:


Composable joins the Replicated Security Testnet today! It’s been over two months since we last launched a consumer chain, so today’s rehearsal involved a bit of shaking off some dust. The Composable testnet chain, banksy-testnet-3, will remain online for about a week so the dev team can run some tests.

Today’s rehearsal timeline was as follows (all times in UTC):

  • 14:52 - Composable upgrade proposal submitted
  • 15:03 - First cc addition proposal submitted on provider chain
  • 15:22 - Composable upgrade proposal passed
  • 16:12 - Composable upgrade height reached, chain paused
  • 16:15 - Consumer chain banksy-testnet-3 offboarded from provider chain
  • 16:20 - Second cc addition proposal submitted on provider chain
  • 16:27 - Spawn time for banksy-testnet-3
  • 18:12 - Enough Composable validators upgraded to reach quorum, Composable chain starts the “three block” countdown
  • 18:23 - Blocks on consumer chain
  • 18:29 - Relayer up, interchain secured

We had to resubmit the consumer-addition proposal on the provider side because the proposal was initially configured with incompatible parameters for the chain ID suffix and the revision number (“banksy-testnet-3” and revision_number: 1). For consumer chains, the integer suffix of the chain ID must match the revision number ("chain_id": "banksy-testnet-3" and "revision_number": 3), otherwise IBC connections cannot be established. Before resubmitting, we had to offboard the improperly-configured banksy-testnet-3, which also triggered a cleanup of all consumer key assignments, so we asked validators to temporarily provision their validator keys to help get blocks going. We’re updating our runbooks to make sure we don’t run into this hiccup again!
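A pre-submission check for this mismatch could look like the sketch below. It is a simplified reading of the rule described above, not the IBC implementation: the function names are hypothetical, and treating a chain ID without a numeric suffix as revision 0 is an assumption.

```python
import re

# A chain ID in "<name>-<revision>" form, where the revision is a positive integer.
_REVISION_SUFFIX = re.compile(r"^.+-([1-9][0-9]*)$")


def revision_from_chain_id(chain_id: str) -> int:
    """Extract the integer suffix of a chain ID (assumed 0 when there is none)."""
    match = _REVISION_SUFFIX.match(chain_id)
    return int(match.group(1)) if match else 0


def revision_matches(chain_id: str, revision_number: int) -> bool:
    """True if the proposal's revision number agrees with the chain ID suffix."""
    return revision_from_chain_id(chain_id) == revision_number


first_attempt = revision_matches("banksy-testnet-3", 1)
second_attempt = revision_matches("banksy-testnet-3", 3)
print(first_attempt, second_attempt)
```

Running a check like this against the proposal JSON before submitting would have flagged the first proposal without needing to offboard the chain.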

We learned a lot from today, and we’re glad that we were able to pull together a rehearsal on a relatively tight timeline for the Composable team. Looking ahead towards the Composable mainnet process and potentially future rehearsals, we’ll be sure to align earlier on communications strategy and the exact timings of when each step will happen. Every practice rehearsal makes us better at coordinating and operating, which will help Cosmos Hub become a bigger, more secure, and more inclusive ecosystem. Massive thank you to all the Cosmos Hub Testnet validators and the Composable validators who participated in today’s event!

Shout outs:

  • KrEwEdk0 from CitadelOne for helping to answer questions and debug in both the Cosmos Hub Discord and the Composable Discord
  • 20/20 from VirtualHive for teaching us new things about jq today

Fantastic news to hear

This Wednesday marked the start of our first period for the Testnet Incentives Program (Nov 22 - Dec 20, 2023).

Today we performed a minor upgrade from v14.0.0 to v14.1.0 (forum post for this version here). v14.1.0 contains cryptographic equivocation code to automatically process equivocation evidence and slash offending validators without the need for an equivocation proposal via governance.

We will be testing the consensus-breaking piece of code in this upgrade on Friday November 24, 2023 at 15:00 UTC.

:warning: Nodes that are still running v14.0.0 at this time will have apphash errors. To avoid this, make sure you have upgraded to v14.1.0 before Friday. :warning:

The Testnet Incentives Program is a pilot program, so we are continuing to gather feedback on process, criteria, etc! Some of the feedback we’ve received:

Verifying mainnet validator status

Validators are telling us: Testnet validators want a way to verify that they operate on mainnet without using their mainnet key on the testnet, for ease-of-completion, operational security, etc.

:saluting_face: We updated the Google form to allow verification using only mainnet keys on mainnet.

Hypha’s validators decide when the upgrade comes online

Validators are telling us: Hypha’s testnet validators hold majority voting power, so the chain starts moving as soon as Hypha comes online rather than when the majority of the decentralized set comes online.

:saluting_face: Long term, we’re looking to decentralize the testnet more, but we need confidence that folks are paying attention and keeping the network online. In the meantime, we’ll commit to waiting 15 minutes after the upgrade height to bring the Hypha fruit validators online :apple: :banana: :cherries:

Signing window is too tight

Validators are telling us: 5 block window is too tight for several reasons (Hypha’s validators holding majority voting power, conflicts with mainnet upgrades or other important testnets, bootstrapping time).

:saluting_face: In addition to the previous fix, we’re looking into more flexible criteria that will reliably recognize teams that are present and responsive while not punishing reasonable latency. This is a difficult balance to strike, so please bear with us!

For now, we’ve decided to remove the 5 block signing window for minor and patch upgrades.

Shout outs to:

  • Toschdev from SG-1, Claimans from CryptoCrew for feedback on verifying mainnet validator status.
  • Freak12techno from Quokka Stake, Flo from CrowdControl for feedback on Hypha’s big validators determining the upgrade speed.
  • Manueldb from Stake&Relax for sharing detailed steps on setting up manual upgrades.

As a reminder, the eligibility criteria for the Testnet Incentives Program are:

| Network | Criteria | Proof |
|---|---|---|
| Mainnet | Be an active mainnet validator on the Cosmos Hub. | Mainnet validator must send a tx with a memo containing the cosmosvaloper address of their testnet validator and submit proof via Google form. |
| Mainnet | Validate all available consumer chains secured by the Cosmos Hub. | Hypha checks mainnet validators and blocks to confirm. |
| Testnet | Remain unjailed for the duration of the period on the provider chain. | Hypha testnet recorder reports jailing events. |
| Testnet | Sign the fifth block after every scheduled Testnet Wednesday event in the period (consumer chain launch, chain upgrade, chain re-launch, etc). | Hypha testnet recorder reports on block signing during and after events on Wednesdays. |
| Testnet | Run mainnet-grade infrastructure (nodes, sentries, etc) on all chains. | Disclosure from operators via Google form. |

November 29, 2023

Last week, we upgraded the pion-1 chain to v2.0.0.

This upgrade contains a lot of migrations, which made it a very long upgrade (which was to be expected). Hypha’s validators used up an extra 100 GB of disk space during the upgrade. This and memory resources are major considerations for mainnet – validators will need to make sure they have enough disk space and memory to perform the upgrade!

Min gas price requirement

  • v2.0.0 makes minimum-gas-prices a mandatory parameter in app.toml. This means that a value must be given, even if it is “0untrn”.
  • The recommended value is 0.02untrn.
  • Validators who had not set this parameter saw errors after upgrading.
  • Some validators reported that their node would not start if there was only an empty string in this field.
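A pre-upgrade check for this setting could be sketched as follows. It scans the text of app.toml with a regular expression instead of using a full TOML parser, which is a simplification; the function names are hypothetical, and only the parameter name minimum-gas-prices comes from the notes above.

```python
import re

# Matches an uncommented line such as: minimum-gas-prices = "0.02untrn"
_SETTING = re.compile(r'^\s*minimum-gas-prices\s*=\s*"([^"]*)"\s*(?:#.*)?$', re.MULTILINE)


def minimum_gas_prices(app_toml_text: str):
    """Return the configured value, or None if the parameter is absent."""
    match = _SETTING.search(app_toml_text)
    return match.group(1) if match else None


def is_usable(app_toml_text: str) -> bool:
    """True only if the parameter is present and not an empty string."""
    value = minimum_gas_prices(app_toml_text)
    return value is not None and value.strip() != ""


empty = is_usable('minimum-gas-prices = ""\n')
zero = is_usable('minimum-gas-prices = "0untrn"\n')
absent = is_usable('pruning = "default"\n')
print(empty, zero, absent)
```

An empty string and a missing parameter are both reported as not usable, matching the two failure modes validators ran into.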

Time- vs height-based expectations for testnet upgrades

Context:

  • Software upgrades are technically height-based, and various tools are used to predict the time at which a particular block height will occur.
  • On mainnet, voting periods for governance proposals (either via governance or a subdao) are typically quite long, and validators know that the upgrade block height may drift significantly.
  • On testnet, voting periods are very short to accommodate testing. This means that we are not as vulnerable to drift, and coordinators have the opportunity to accurately pick a block height based on a time, rather than trying to predict a time based on a block height set weeks in advance.

pion-1 upgrade:

  • Originally, the pion-1 upgrade was targeted for block 7998200 which was expected to occur at 15:30 UTC.
  • Due to block drift, the Neutron team anticipated that block 7998200 would occur 40 minutes later than expected and decided to use block 7997000 instead, to align with the target upgrade time of 15:30 UTC.
  • This switch was not well communicated to testnet participants, so there were issues with preconfigured scripts breaking and online participants needing to make changes on the fly.
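The numbers in this change imply an average block time, which is the same figure used to turn a block height into an expected time. Here is a worked sketch: the two block heights and the 40-minute shift come from the bullets above, while the function names, the current height, and the calendar date are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def implied_block_time(height_a: int, height_b: int, shift: timedelta) -> float:
    """Average seconds per block implied by moving between two heights."""
    return shift.total_seconds() / abs(height_a - height_b)


def estimate_arrival(current_height: int, current_time: datetime,
                     target_height: int, seconds_per_block: float) -> datetime:
    """Predict when target_height will be reached at a constant block time."""
    remaining_blocks = target_height - current_height
    return current_time + timedelta(seconds=remaining_blocks * seconds_per_block)


# 7998200 -> 7997000 is 1200 blocks, chosen to offset a 40 minute delay.
block_time = implied_block_time(7998200, 7997000, timedelta(minutes=40))

# Hypothetical observation: the chain is at height 7996100 at 15:00 UTC.
now = datetime(2000, 1, 1, 15, 0, tzinfo=timezone.utc)  # placeholder date
eta = estimate_arrival(7996100, now, 7997000, block_time)
print(block_time, eta.strftime("%H:%M"))
```

Because the estimate is linear in the block time, a small change in average block time moves the predicted time by that change multiplied by the number of remaining blocks, which is why long lead times drift more.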

On the RS testnet, we want to recreate mainnet conditions as closely as possible. We will work to set the following standards for upgrades:

  • Upgrades are based on a block height.
  • Block heights will be chosen and communicated early enough that operators can automate and use scripts (especially if this is what operators do on mainnet).
  • Validators are still expected to use monitoring setups and be available to step in if the scripts or automation fails.
  • Once communicated, block heights will not change except in emergencies and when we can do sufficient comms to make sure everyone knows what is happening.

Testnet incentives program

  • Hypha waited 10 minutes before bringing the fruit validators online, giving all other validators time to peer before bringing the chain online. Because this upgrade was already anticipated to take a long time (quoted as 1 hour in the Neutron PR), we’re considering that we could have waited even longer before bringing the fruit validators up. We’re continuing to iterate on the appropriate time to bring our vals online, considering the voting power they hold.
  • Given the communication issues around this upgrade, we’ve decided that the most lenient reasonable criterion is to be up and signing by block 7998210 (10 blocks after the originally communicated upgrade height, and 2 hours 12 minutes after the scheduled upgrade time). 45 validators met this criterion.

Some other TIP notes:

  • Validators must be in the active set at time of payment to be eligible. If you are not currently in the active set, you might still be eligible at payment time so keep participating!
  • Hypha checks that validators have remained unjailed for the full period at time of payment.

For better visibility into the eligibility process, this spreadsheet will be updated after each Testnet Wednesday.


@lexa one more point I raised on Discord: is it possible to submit an upgrade proposal on chain much earlier?
My reasoning: I use a monitoring that generates metrics for all my nodes if there’s an upcoming upgrade (e.g. proposal that has passed) which I use for building 2 alerts: 1) to fire if I do not have a binary prepared in Cosmovisor folder, and 2) if an upgrade is in less than 30 minutes, to remind me to be present.
If I made a mistake somehow (for example if I put the binary into a wrong folder), or for example if the proposal name somehow doesn’t match the one specified in README/instructions provided by the team (so the binary should be put into a different folder), I won’t have enough time to react and fix things accordingly.
On mainnet there’s usually a big timespan between the date the proposal is accepted and the actual chain upgrade, this time on testnet it was merely a few minutes, so here 1) it doesn’t match the mainnet conditions and 2) it leaves really short time for validators to react.

Do you think considering what I wrote above it’s possible to submit such proposals, let’s say one day before the estimated time of the block it’s supposed to be executed on?
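The two alerts described in this reply can be sketched roughly as below. This is not the poster's monitoring setup. It assumes the standard Cosmovisor layout, where the binary for an upgrade lives under cosmovisor/upgrades/<name>/bin/ inside the daemon home; the function names, the upgrade name "v2.0.0", and the daemon name "neutrond" are illustrative.

```python
import os
import tempfile
from pathlib import Path


def upgrade_binary_path(daemon_home: Path, upgrade_name: str, daemon_name: str) -> Path:
    """Path where Cosmovisor looks for the binary of a named upgrade."""
    return daemon_home / "cosmovisor" / "upgrades" / upgrade_name / "bin" / daemon_name


def binary_is_prepared(daemon_home: Path, upgrade_name: str, daemon_name: str) -> bool:
    """Alert 1: True if an executable file exists at the expected path."""
    path = upgrade_binary_path(daemon_home, upgrade_name, daemon_name)
    return path.is_file() and os.access(path, os.X_OK)


def upgrade_is_imminent(seconds_until_upgrade: float, window_seconds: int = 30 * 60) -> bool:
    """Alert 2: True if the upgrade is expected within the reminder window."""
    return 0 <= seconds_until_upgrade <= window_seconds


def place_binary(path: Path) -> None:
    """Create a stand-in executable for the demo."""
    path.parent.mkdir(parents=True)
    path.write_text("#!/bin/sh\n")
    path.chmod(0o755)


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    missing = binary_is_prepared(home, "v2.0.0", "neutrond")

    # Binary placed in a folder whose name does not match the proposal name.
    place_binary(upgrade_binary_path(home, "v2.0", "neutrond"))
    misplaced = binary_is_prepared(home, "v2.0.0", "neutrond")

    place_binary(upgrade_binary_path(home, "v2.0.0", "neutrond"))
    prepared = binary_is_prepared(home, "v2.0.0", "neutrond")

print(missing, misplaced, prepared)
```

The misplaced case is the one that needs lead time: the check only fails once the on-chain proposal name is known, so a proposal that lands minutes before the upgrade leaves little room to fix the folder.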

The upgrade proposal for consumer chains is managed by the consumer chain team, which is how it would be on mainnet. I can work with them on committing to putting proposals on-chain on testnet earlier. You’re right – we want to be letting people use scripts and monitoring the way they do on mainnet, and mainnet gets way more lead time than testnet.

Occasionally we do try to run two events on the same day, in which case I’d aim to have the highest priority event submit their proposal earlier and then have the second event waiting in the wings once we know our block drift won’t cause the events to collide.

Really appreciate the feedback :slight_smile:


Today’s Testnet Wednesday was delayed as validators and testnet coordinators were on deck to execute and support an emergency upgrade for mainnet neutron-1.

neutron-1 emergency upgrade

The Neutron team coordinated an emergency upgrade for neutron-1 to include a security-critical patch to cosmwasm.

The Neutron team used a two-stage plan:

  1. Distribute a halt height to the validator community
  2. Distribute the upgrade information, including the new release binary link

By distributing the halt height first, they allowed validators to prepare their nodes and be aware of the upcoming upgrade without even having a new release available.

neutron-1 halted at 10:04:35 ET and returned at 10:08:05, after a 3m 30s downtime.

:clap: amazing work all around, especially from the Neutron team coordinating this upgrade!

pion-1 emergency upgrade

  • The pion-1 upgrade was scheduled right after it was determined that the mainnet (neutron-1) chain upgrade was complete
  • The halt height was set to 9723450, at approximately 16:45 UTC (11:45 ET)
  • A signaling proposal was submitted to the provider chain of the Replicated Security testnet to use this event as eligibility criteria for the Testnet Incentives Program

Timeline in ET:

10:52: cosmos/testnets repo is updated with v2.0.1 upgrade information

11:13: Proposal 93 is submitted to the provider chain to serve as a record of Testnet Incentives Participation

11:14: Proposal 93 enters the voting period

11:21: testnet-announcements post: notify ReplicatedSecurity that pion-1 will be upgraded, includes new version, halt height, and links to the pion-1 upgrade page and proposal 93

11:47: Chain halts at 11:47:41 ET (16:47:41 UTC)

12:03: Hypha starts apple, banana, and cherry nodes back up

12:14: Chain starts again at 12:14:21, after a 26m 40s downtime
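The two downtime figures in this post can be checked with a short calculation. This sketch treats ET as a fixed UTC-5 offset, which matches the "16:45 UTC (11:45 ET)" conversion above, and reads the pion-1 halt time of 16:47:41 as UTC. The calendar date is a placeholder, since only clock times are given.

```python
from datetime import datetime, timedelta, timezone

# Assumption: ET as a fixed UTC-5 offset, consistent with 16:45 UTC = 11:45 ET.
ET = timezone(timedelta(hours=-5), "ET")
UTC = timezone.utc
DAY = (2000, 1, 1)  # placeholder date, only the clock times come from the post


def format_downtime(delta: timedelta) -> str:
    """Render a duration as minutes and seconds."""
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes}m {seconds:02d}s"


# neutron-1: halted 10:04:35 ET, returned 10:08:05 ET.
neutron = datetime(*DAY, 10, 8, 5, tzinfo=ET) - datetime(*DAY, 10, 4, 35, tzinfo=ET)

# pion-1: halted 16:47:41 UTC, restarted 12:14:21 ET.
pion = datetime(*DAY, 12, 14, 21, tzinfo=ET) - datetime(*DAY, 16, 47, 41, tzinfo=UTC)

print(format_downtime(neutron), format_downtime(pion))
```

Using timezone-aware values lets the subtraction handle the mix of ET and UTC timestamps in the timeline.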

We had some confusion over the pion-1 minimum-gas-prices setting due to differences between mainnet and testnet. This was resolved – pion-1 minimum-gas-prices should be 0untrn.

pion-1 testnet upgrade stats

  • Event runtime: 1 hour from proposal 93 going on-chain to pion-1 coming online again
  • pion-1 downtime: 26m (intentionally longer than mainnet)
  • Validators that voted yes on Prop 93: 33 / 60
  • Validators that signaled “upgraded” in the replicated-security-testnet channel: 22 / 60

Next week we will be running a launch rehearsal for Aether and upgrading pion-1 (on-schedule, not emergency :crossed_fingers: ).


We had a particularly busy Testnet Wednesday this week. Both Neutron and Aether got their proposals in earlier in the week and we anticipated upgrading pion-1 first at ~15:30 UTC and then launching Aether at 16:00 UTC.

Hypha distributed voting power more on Tuesday and this slowed down pion-1’s block time by about 0.2s, making the upgrade time closer to 17:00 UTC. This is expected behaviour – a more decentralized set means that the proposers are more geographically distributed and so communication takes longer. Lesson learned for us – don’t make changes to voting power distribution just before an upgrade.

As a result, we launched Aether’s ethera_9000-1 chain first.

ethera_9000-1 chain launch

46/61 validator participation

  • 13:44 UTC: Hypha announces that we’ll launch ethera_9000-1 at 15:30 UTC
  • 14:00 UTC: spawn time is reached and CCV is generated, validators begin joining
  • 15:30 UTC: Hypha brings validators online
  • 15:32 UTC: Blocks are being produced and chain is interchain-secured

This was a very efficient rehearsal with a few notes:

  1. The Aether chain requires the chain-id flag to be used in the start command, whereas previous chains have not. If launching on mainnet, clear instructions and reminders should be provided.
  2. Validators who do not include the chain-id flag will get a panic error and should use unsafe-reset-all before restarting the node with the chain-id flag used.

pion-1 upgrade

51/61 validator participation

  • 16:53 UTC: Upgrade height is reached and validators start upgrading
  • 17:08 UTC: Hypha brings validators online
  • 17:09 UTC: Blocks are being produced

Very smooth upgrade from the Neutron team!


Other notes for this Wednesday:

  • The Aether team’s codebase is now open and available for perusal here.
  • We had some discussion about communicating times in UTC vs using Discord’s time-conversion tools to automatically report timestamps in a user’s local timezone. I’ll experiment with both going forward; currently I only use UTC.
  • Shout outs to freak12techno from Quokka Stake for the tmtop tool, which now reports on who’s assigned keys :eyes: , and James from Lavender.Five for troubleshooting and node operations support in the channel!

Next week (January 24) will be the end of TIP period 2! We will be either running the v15 upgrade or a game day of some kind :thinking: stay tuned!


We ran our first-ever game day :game_die: on the testnet today!

In this game day, we investigated two new ideas on a brand new chain:

  1. Cancelling a software upgrade proposal using a governance-gated cancel-upgrade proposal
  2. Skipping an upgrade using the --unsafe-skip-upgrades <block-height> flag

We had 39 validators participating today on this new gameday01 chain.

cancel-upgrade

For our first event, we submitted a proposal to upgrade from Gaia v13.0.2 to v14.0.0-rc0 and then cancelled it using a cancel-upgrade proposal which validators had to vote on.

31/39 validators voted successfully on the cancel-upgrade proposal and we avoided upgrading to v14.0.0-rc0.

However – this is not a realistic scenario for the Hub. Because the Cosmos Hub voting period is 2 weeks and we usually upgrade roughly 1 week after an upgrade proposal passes, there is no time for a cancel-upgrade proposal before the upgrade height arrives.

On mainnet, it’s much more practical to just skip the upgrade height.

unsafe-skip-upgrades

For our second event, we pretended that the issue had been resolved and we were ready to upgrade to v14.0.0 via a software upgrade proposal.

As upgrade height approached, validators were informed that this upgrade was faulty as well, and everyone would need to restart their nodes using the --unsafe-skip-upgrades 31156 flag to keep running their node with v13.0.2.

Validators who successfully skipped the upgrade height would continue signing blocks as usual on gameday01, while validators who did not skip the upgrade height would apphash and stop signing. Of the 39 validators, 34 successfully applied the flag in time and were able to keep signing without interruption.

In a real-world scenario, it is perfectly fine for some validators to halt so long as > 2/3 of the set successfully applies the flag.
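The 2/3 condition is about voting power instead of head count, so a quick check needs the power of the validators that applied the flag. A minimal sketch with made-up power figures and a hypothetical function name:

```python
from fractions import Fraction


def chain_keeps_running(signing_power: int, total_power: int) -> bool:
    """True if strictly more than 2/3 of total voting power is still signing."""
    return Fraction(signing_power, total_power) > Fraction(2, 3)


# Hypothetical voting power totals, not figures from the game day.
comfortable = chain_keeps_running(85, 100)
just_enough = chain_keeps_running(67, 100)
just_short = chain_keeps_running(66, 100)
exactly_two_thirds = chain_keeps_running(2, 3)
print(comfortable, just_enough, just_short, exactly_two_thirds)
```

Exact fractions avoid rounding problems at the boundary: exactly 2/3 is not enough, because the requirement is strictly greater than 2/3.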

Lessons learned

  • 39 validators participating
  • 23 validators voting on prop #1 (not a TIP criteria because we didn’t announce it in time :sweat_smile:)
  • 31 validators voting on prop #2 (TIP criteria)
  • 34 validators voting on prop #4 (TIP criteria)
  • 34 validators still signing after the upgrade height
  • Exactly 1 hour event run-time :tada:
  • Shout outs to James from Lavender.Five for snapshots, Benjamin from Blockscape for helping identify a potential cosmovisor bug now being investigated here.

Some useful things that we learned along the way:

1. A list of voters can be obtained even for a proposal that is no longer active using a block height:

REST API:
curl "<API URL>/cosmos/gov/v1beta1/proposals/<proposal id>/votes?pagination.limit=1000" -H "x-cosmos-block-height:<height>"

Gaia CLI:
gaiad q gov votes <proposal id> --height <height>
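For scripting, the same REST query can be built in Python. This sketch only constructs the request, using the path and header from the curl command above; the function name, API URL, proposal ID, and height are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def votes_request(api_url: str, proposal_id: int, height: int, limit: int = 1000) -> Request:
    """Build a request for a proposal's votes as of a past block height."""
    query = urlencode({"pagination.limit": limit})
    url = f"{api_url.rstrip('/')}/cosmos/gov/v1beta1/proposals/{proposal_id}/votes?{query}"
    # The height header asks the node to answer from state at that block.
    return Request(url, headers={"x-cosmos-block-height": str(height)})


# Placeholder endpoint, proposal ID, and height.
req = votes_request("https://rest.example.invalid", 101, 123456)
headers = dict(req.header_items())  # urllib stores the name as "X-cosmos-block-height"
print(req.full_url)
print(headers)
```

Passing the request to urllib.request.urlopen and decoding the body with json.load would return the vote list, provided the node still holds state for the requested height.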

2. For unusual proposal types, sometimes block explorers don’t render the information correctly.

We were using an old version of Ping Pub to display the cancel-upgrade proposal, but this issue is fixed in the newer Ping Pub. Still – it’s important to know how to view proposal info using the CLI as well!

3. How would this work on mainnet?

Testnet is a critical place not only for testing code, but also for working out best practices and rehearsing emergency situations.

On the Hub, if a critical bug is detected in an upcoming upgrade, Hypha’s recommendation is:

  1. Communicate to all validators to restart their nodes using the --unsafe-skip-upgrades <block-height> flag. This information should be pushed via the Discord #validators-verified channel, the email list, and personal Telegram, Discord, and Signal contacts.
  2. As upgrade height approaches, advise validators to take snapshots of their nodes before the upgrade height. In case something goes wrong, having a recent snapshot can be critical to recovering the chain.
  3. After upgrade height, validators who have successfully continued signing should take another snapshot to share with the network if necessary.

Snapshots can be incredibly resource intensive, so we would also recommend that validators run more powerful machines as upgrade height approaches.

TIP period 2 ends

This Testnet Wednesday also marks the end of TIP Period 2. A more thorough report will be posted to the Testnet Incentive Program thread as payments are sent out :slight_smile:

As before, eligibility can be tracked here.


February 7, 2024

Lately there has been a ton of spam on the Hub getting vetoed. Historically, it hasn’t been really clear which outcomes lead to a deposit being burned so we decided to do a demo day of different proposal outcomes.

Proposals on the Hub can have five outcomes:

  1. :white_check_mark: PASS: Quorum achieved, >50% of participating voting power votes YES
  2. :x: REJECT: Quorum achieved, >50% of participating voting power votes NO or NWV
  3. :skull_crossbones: VETO: Quorum achieved, >33.3% of participating voting power votes NWV
  4. :person_shrugging: INCONCLUSIVE: Quorum achieved, neither YES/NO side achieves >50% of participating voting power
  5. :cricket: NO QUORUM: Quorum not achieved
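The five outcomes above can be written as a small decision function. This sketch follows the list as worded in this post, not the chain's tally code, which may treat abstain votes differently. The 40% quorum default is an assumption, the function name is hypothetical, and the deposit flag reflects the burn behaviour observed in this demo (VETO and NO QUORUM).

```python
from fractions import Fraction


def classify(yes: int, no: int, nwv: int, abstain: int, total_bonded: int,
             quorum: Fraction = Fraction(2, 5),      # assumed quorum of 40%
             threshold: Fraction = Fraction(1, 2),
             veto: Fraction = Fraction(1, 3)):
    """Return (outcome, deposit_burned) for a proposal's final tally."""
    participating = yes + no + nwv + abstain
    if participating == 0 or Fraction(participating, total_bonded) < quorum:
        return "NO QUORUM", True
    if Fraction(nwv, participating) > veto:
        return "VETO", True
    if Fraction(yes, participating) > threshold:
        return "PASS", False
    if Fraction(no + nwv, participating) > threshold:
        return "REJECT", False
    return "INCONCLUSIVE", False


# Hypothetical tallies out of 100 units of bonded voting power.
results = [
    classify(60, 10, 5, 5, 100),
    classify(10, 50, 10, 10, 100),
    classify(20, 10, 40, 10, 100),
    classify(20, 20, 0, 40, 100),
    classify(10, 5, 0, 5, 100),
]
for outcome, burned in results:
    print(outcome, "deposit burned" if burned else "deposit returned")
```

The veto check runs before the pass check, so a proposal with a YES majority can still end in VETO if NWV exceeds one third.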

We ran five different proposals, each intended to showcase a different voting outcome and how the deposit was handled. Each proposal was submitted by a unique wallet that was initially funded with 11 ATOM (for a 10 ATOM deposit on each proposal).

Kinda neat to see all those voting options show up so clearly in the explorers!

Outcomes resulting in a returned deposit:

:white_check_mark: PASS

Wallet cosmos1mrwtsv7p53k90ey2nej4glsv3gphujkh8fr0mx contained 11 ATOM after proposal 101 passed.

:x: REJECT

Wallet cosmos10d48f9w5e86r74j7jnhw7hu489649jzcz4hqdn contained 11 ATOM after proposal 102 was rejected.

:person_shrugging: ABSTAIN

Wallet cosmos18vkqj9ha0h98nummd7z9z42ff56ktmz0st4yye contained 11 ATOM after proposal 104 hit quorum but was neither passed nor rejected.

Outcomes resulting in a burned deposit:

:skull_crossbones: VETO

Wallet cosmos17nrgds90as3ccmlvxpsfamaw44l68lnh94cexr contained 1 ATOM after proposal 103 was vetoed.

:cricket: NO QUORUM

Wallet cosmos122fe029kyunndwygyu7cwukleueau6layp5ejn contained 1 ATOM after proposal 105 failed to reach quorum.

Lessons learned

We had 38 validators participating in this event!

Some validators reported learning the following:

  • Deposits are burned for proposals which fail to meet quorum.
  • Voting can be changed any time during the voting period. This means that you can change your mind so long as the voting period has not ended!
  • Some explorer UIs hide proposals which are being overwhelmingly vetoed, as a spam prevention measure.

February 14, 2024 :two_hearts:

v15 is coming to the Hub! This is going to be a big one because of state migrations (executing on decisions made in proposals 826 and 860) and moving to sdk 0.47.

On the testnets, we upgraded theta-testnet-001 on Tuesday and did provider today to ease the load on validators operating on both networks. We had 41 validators participating on provider.

  • Theta runtime: 55 minutes
  • Provider runtime: 41 minutes

Lessons learned

Major feedback from operators who have now gone through this upgrade process at least once (some went through it twice):

  • Don’t panic at the long wait and lack of updates in the logs! Logs are not very verbose and it will look like nothing is happening, but just be patient. Hypha and Informal are working with the Comet team to see about getting more informative logs during upgrades in the future.
  • Do not restart your node mid-upgrade unless you’re seeing OOM errors. Having learned from our big Neutron upgrade, we know that restarting mid-upgrade can make the situation worse and lead to a corrupted database.
  • Set min gas price to 0.005uatom or your node will fail on restart. Failing on restart just means that you’ll have to set the param at that point, but doing it ahead of time will be smoother :butter:

Our testnet work today fed directly into the mainnet upgrade recommendations, such as ensuring that some validators will be taking snapshots to help validators who aren’t able to meet the hardware requirements for mainnet state migrations.


February 21, 2024

Today we did a coordinated upgrade to bump provider to rc1, then went through a changelog review game day.

Minimum deposit ratio

Deposits to proposals in v15 must be made according to a minimum deposit ratio of 0.01 (i.e., the minimum amount that can be sent to a proposal in the deposit period is 1% of the total minimum deposit).
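To make the arithmetic concrete, here is a small sketch that multiplies a ratio of 0.01 by a minimum deposit. The 100 ATOM minimum deposit is a hypothetical value picked for round numbers, the helper names are made up for this example, and truncating to a whole uatom is an assumption rather than confirmed chain behaviour.

```python
from decimal import Decimal

UATOM_PER_ATOM = 1_000_000


def min_single_deposit(min_deposit_uatom, ratio="0.01"):
    """Smallest deposit (in uatom) allowed by the ratio; truncation is assumed."""
    return int(Decimal(min_deposit_uatom) * Decimal(ratio))


def deposit_accepted(amount_uatom, min_deposit_uatom, ratio="0.01"):
    """True if a single deposit is at least ratio * minimum deposit."""
    return amount_uatom >= min_single_deposit(min_deposit_uatom, ratio)


# Hypothetical total minimum deposit of 100 ATOM.
min_deposit = 100 * UATOM_PER_ATOM
print(min_single_deposit(min_deposit))         # smallest accepted deposit, in uatom
print(deposit_accepted(500_000, min_deposit))  # a 0.5 ATOM deposit
```

Under those assumptions, a 0.5 ATOM deposit would be turned away while a 1 ATOM deposit would be accepted.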

Parameter change proposals

  • Gaia-specific parameters such as globalfee can still be changed via a submit-legacy-proposal and take effect.
  • Parameters from the sdk (such as those within the staking module) cannot be changed using submit-legacy-proposal.
  • Parameters from the sdk can be changed using submit-proposal.

However, parameter change proposals now require the submitter to list all parameters in the module and the value of each parameter. Previously, a parameter change prop could name just the single param it wanted to adjust.

This represents a risk to the voting process as validators and voters need to confirm that a parameter change proposal does not accidentally or maliciously change a parameter which isn’t clearly communicated in the proposal text.

This can be observed in Proposals 112 and 113 on the testnet. Both proposals claim to set max validators to 185, but one proposal sneaks in an additional change (increasing min commission from 5% to 15%).
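One way to guard against this is to diff the full parameter set in a proposal against the current on-chain values and flag anything the proposal text did not announce. The sketch below is only an illustration: the helper names are hypothetical, the key names and the "current" values (such as 180 max validators and the unbonding time) are made up, and only the 185, 5%, and 15% figures come from the proposals described above.

```python
def changed_params(current, proposed):
    """Map each differing key to a (current, proposed) pair of values."""
    keys = sorted(set(current) | set(proposed))
    return {
        key: (current.get(key), proposed.get(key))
        for key in keys
        if current.get(key) != proposed.get(key)
    }


def unexpected_changes(current, proposed, announced):
    """Changes in the proposal that its text did not announce."""
    diff = changed_params(current, proposed)
    return {key: pair for key, pair in diff.items() if key not in announced}


# Illustrative values only; not queried from any chain.
current = {
    "max_validators": 180,
    "min_commission_rate": "0.05",
    "unbonding_time": "1814400s",
}
honest = dict(current, max_validators=185)
sneaky = dict(current, max_validators=185, min_commission_rate="0.15")

print(unexpected_changes(current, honest, {"max_validators"}))
print(unexpected_changes(current, sneaky, {"max_validators"}))
```

The honest proposal yields an empty result, while the sneaky one surfaces the min commission change that the proposal text never mentioned.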

Demonstration events

We also shared some info about additional changes in this release:

  • Gaia now outputs to stdout instead of stderr
  • tx bank multi-send allows an account to send to multiple recipients in one message
  • q bank spendable-balances makes it easier to find out amounts available in vesting accounts
  • q bank denomowners lets us find which accounts have a specified denom

TIP results

We expected to have 5 TIP criteria in this event but we didn’t wind up tagging @Replicated Security for two of them, so we have reduced the event to three TIP criteria:

  1. Make a deposit to Proposal 109
  2. Vote YES on Proposal 110
  3. Vote YES on Proposal 111

Additional lessons learned/feedback

  • Integrator tools such as wallets and block explorers should work on displaying the previous and proposed value of parameters. This would make it much easier to evaluate new param change proposals.
  • MsgCancelUnbondingDelegation is also included in v0.47 sdk and can be used in v15 (example from Quokka Stake here)
  • Deposits can still be made to proposals that are in the voting period, but it would be weird to do that. What’s the point? :thinking:

Unsure if this is the place to do this, but I want to say kudos again to the Hypha team here. Our (Citizen Web3) DevOps has been attending the events for 2 weeks and has only the most positive feedback about what is being done there. If you want to improve your understanding of gaia/gaiad, this is a must!


I second the statement above, as I’ve been a testnet member for over 2 months already.

Firstly, it’s a really great playground for validators to learn to deal with unexpected situations and therefore be more prepared for things going wrong on mainnet.

Secondly, it’s also a great way to see the features that will soon land on mainnet beforehand, and to learn how to use them.

Lastly, I have nothing but positive things to say about how the whole thing is coordinated, both from Hypha’s side (as they were extremely helpful, willing to listen to validators’ suggestions on improving the whole project, and willing to help those in need) and from the validators’ side (everybody is super helpful and I’ve seen a lot of validators helping others in need).

Happy to be a part of it and hope some of my support was also helpful. Way to go, guys!


March 6, 2024

With two emergency upgrades this week, I’m a bit late getting this report out :face_exhaling:
Fortunately both Neutron and Hub upgrades went super smoothly – hardly a noticeable halt!

On the testnet front, we had a smooth scheduled upgrade of pion-1 to v3.0.0. The upgrade itself took < 30 minutes and we had 35 validators participating. We also had a new Hypha member running the show behind the scenes with Dante off on a much-deserved vacation :tada:
