Cosmos-SDK & IBC Vulnerability Retrospective: Security Advisories Dragonberry and Elderflower (October 2022)

On October 8, 2022, a small group of Cosmos engineers led by Dev Ojha (@Valardragon) of Osmosis began an intensive security review of the ICS-23 implementation. This review took place as a response to the BSC incident that impacted Binance Chain on October 7, 2022, and focused on areas where ICS-23 interfaces with proofs generated by the IAVL Merkle tree used by the Cosmos SDK.

As a result of this bug hunt, two separate security vulnerabilities were identified in the Cosmos stack.

  • The first vulnerability, Dragonberry, originated in ICS-23 (IBC) and enabled the forgery of IBC timeouts. Timeout forgeries could be escalated to ICS-20 doublespend.
  • The second vulnerability, Elderflower, originated in Authz (Cosmos-SDK) and enabled bypassing parts of the Cosmos SDK message authentication system. This authentication bypass could potentially be escalated to inflation, theft, or other exploits depending on chain-specific implementation details.

These vulnerabilities were patched when core devs across IBC-connected chains distributed the Dragonberry patch for Cosmos SDK. In this post, we would like to provide details about both incidents as well as the next steps for the security of the Interchain.

Issue #1: Dragonberry

Issue Timeline

On Sunday, Oct 9th around 4pm CST, Dev Ojha discovered a vulnerability that would allow forged absence proofs to be accepted by IBC. Within a few hours of discovery, Dev disclosed the details of his findings to Ethan Buchman (Informal Systems) and Aditya Sripal (Interchain GmbH) for validation of the issue. After it was reproduced, a Dragonberry Response Team came together to begin assessing paths for remediation and resolution.

While the discovery of a security vulnerability is not a security incident, the recent exploitation of BSC pointed to the high likelihood of a knowledgeable attacker who could independently find and exploit the Dragonberry bug. Due to the risk of a bug collision here, the patching process for this critical bug was treated as if it were an active incident to reduce the window of exploitation.

The Vulnerability

IBC has a user-configurable timeout mechanism that allows a sender to reclaim funds if a packet is not relayed within a certain time. Given the ability to forge a proof of absence (i.e., a false proof that a packet was not received), an attacker could trick the IBC protocol into believing that an IBC transfer had both succeeded and failed.

This vulnerability could have been exploited in order to iteratively drain all ICS-20 escrow accounts. This effectively means that all funds in IBC channels across Cosmos networks were vulnerable to this exploit.
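To make the escalation concrete, here is a minimal Go sketch of a timeout refund that is gated on an absence proof. All names (Packet, Escrow, AbsenceVerifier, Timeout) are hypothetical and are not taken from ibc-go; the sketch only illustrates that the refund is exactly as trustworthy as the absence check behind it.

```go
package main

import (
	"errors"
	"fmt"
)

// Packet is a simplified, hypothetical stand-in for an ICS-20 transfer packet.
type Packet struct {
	Sequence uint64
	Amount   uint64
}

// AbsenceVerifier reports whether proof shows that the counterparty chain
// holds no receipt for the packet. A sound verifier returns true only when
// the receipt is really absent.
type AbsenceVerifier func(p Packet, proof []byte) bool

// Escrow models the funds locked on the source chain for one channel.
type Escrow struct {
	balance  uint64
	refunded map[uint64]bool
}

// Timeout refunds the packet amount if, and only if, the verifier accepts
// the absence proof.
func (e *Escrow) Timeout(p Packet, proof []byte, verify AbsenceVerifier) error {
	if e.refunded[p.Sequence] {
		return errors.New("packet already timed out")
	}
	if !verify(p, proof) {
		return errors.New("absence proof rejected")
	}
	if e.balance < p.Amount {
		return errors.New("insufficient escrow balance")
	}
	e.balance -= p.Amount
	e.refunded[p.Sequence] = true
	return nil
}

func main() {
	// Assumption for this demo: the packet WAS received on the destination
	// chain, so no honest absence proof can exist for it.
	p := Packet{Sequence: 1, Amount: 100}
	escrow := &Escrow{balance: 100, refunded: map[uint64]bool{}}

	sound := func(Packet, []byte) bool { return false }  // rejects the forgery
	unsound := func(Packet, []byte) bool { return true } // accepts the forgery

	fmt.Println("sound verifier:", escrow.Timeout(p, []byte("forged"), sound))
	fmt.Println("unsound verifier:", escrow.Timeout(p, []byte("forged"), unsound))
	fmt.Println("escrow balance:", escrow.balance)
}
```

With the unsound verifier, the sender is refunded on the source chain even though the tokens were already delivered on the destination chain, which is the doublespend described above.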

IBC is a trustless protocol for establishing connections between two machines, where details such as Merkle tree verification can differ from chain to chain. IBC’s transport security relies on Merkle proofs of existence and absence. IBC employs a descriptive language of Merkle proofs as part of the ICS-23 specification; this language and specification lack a definition of soundness. This means it might be possible to construct an ICS-23 proof of existence or absence that passes verification even though it does not reflect the contents of the original Merkle tree.

Mitigation and Remediation Timeline

The Dragonberry response team developed a patch for this vulnerability which performs additional validation on the structure of ICS-23 IAVL proofs, subsequently ensuring invalid proofs are rejected. Once this patch was developed, the team engaged in ad-hoc coordination with validators and began the process of distributing the patch and an advisory to all core teams across the Interchain.
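As an illustration of what "additional validation on the structure" of a proof can look like, the following Go sketch bounds the byte lengths of one inner step of a Merkle proof. It is not the actual patch: the types InnerSpec and InnerOp are simplified, hypothetical stand-ins loosely modeled on ICS-23 concepts, and the specific bounds are assumptions chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// InnerSpec describes the expected shape of an inner node (hypothetical,
// simplified for this sketch).
type InnerSpec struct {
	ChildSize       int // bytes in one child hash
	Children        int // branching factor of the tree
	MinPrefixLength int // fewest non-hash bytes allowed before the child
	MaxPrefixLength int // most non-hash bytes allowed before the child
}

// InnerOp is one proof step: the node preimage is Prefix || child || Suffix.
type InnerOp struct {
	Prefix []byte
	Suffix []byte
}

// CheckInnerOp rejects steps whose prefix or suffix could not have been
// produced by a real node of the described tree.
func CheckInnerOp(op InnerOp, spec InnerSpec) error {
	if spec.ChildSize <= 0 || spec.Children < 2 {
		return errors.New("invalid spec")
	}
	// At most Children-1 sibling hashes can sit on either side of the child.
	maxSiblingBytes := (spec.Children - 1) * spec.ChildSize
	if len(op.Prefix) < spec.MinPrefixLength {
		return errors.New("prefix too short")
	}
	if len(op.Prefix) > spec.MaxPrefixLength+maxSiblingBytes {
		return errors.New("prefix too long")
	}
	if len(op.Suffix)%spec.ChildSize != 0 {
		return errors.New("suffix is not a whole number of child hashes")
	}
	if len(op.Suffix) > maxSiblingBytes {
		return errors.New("suffix too long")
	}
	return nil
}

func main() {
	// Assumed example spec: binary tree, 32-byte hashes, 4 to 12 header bytes.
	spec := InnerSpec{ChildSize: 32, Children: 2, MinPrefixLength: 4, MaxPrefixLength: 12}

	ok := InnerOp{Prefix: make([]byte, 4), Suffix: make([]byte, 32)}
	padded := InnerOp{Prefix: make([]byte, 100), Suffix: nil}

	fmt.Println("well-formed step:", CheckInnerOp(ok, spec))
	fmt.Println("padded step:", CheckInnerOp(padded, spec))
}
```

The design idea is that a verifier which hashes whatever bytes the prover supplies leaves room for ambiguity; constraining each step to shapes the real tree can produce narrows what a forged proof can encode.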

On Monday, Oct 10, validators across all chains in the Cosmos ecosystem began to deploy this patch. By the end of the first day of coordination, +⅓ of the voting power on a majority of Cosmos SDK powered chains confirmed that they had mitigated the risk of exploitation on their chains, with some core teams opting for a higher patching threshold. Once +⅓ of a network’s voting power was upgraded, any attempt to exploit the vulnerability would be visible as it would result in non-determinism that halts the impacted network. Though a network halt is typically not ideal, simply put, a halted network is better than a network that is both vulnerable and exploitable to a sufficiently resourced, capable attacker.
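The threshold argument can be sketched as simple arithmetic. The Go example below assumes a Tendermint-style rule in which a state is only committed with strictly more than ⅔ of voting power, and it models patched and unpatched validators as disagreeing on any block that contains an exploit attempt. The function name Classify and the outcome labels are illustrative, not part of any Cosmos library.

```go
package main

import "fmt"

// Outcome describes what happens when an exploit transaction is submitted.
type Outcome string

const (
	ExploitCommits  Outcome = "unpatched validators can commit the forged state"
	NetworkHalts    Outcome = "neither side reaches a supermajority, network halts"
	ExploitRejected Outcome = "patched validators commit and reject the exploit"
)

// Classify applies the assumed >2/3 commit rule to both validator groups.
// Voting power values are assumed small enough that multiplying by 3 does
// not overflow uint64.
func Classify(patchedPower, totalPower uint64) Outcome {
	if patchedPower > totalPower {
		patchedPower = totalPower
	}
	unpatchedPower := totalPower - patchedPower
	switch {
	case unpatchedPower*3 > totalPower*2:
		return ExploitCommits
	case patchedPower*3 > totalPower*2:
		return ExploitRejected
	default:
		return NetworkHalts
	}
}

func main() {
	for _, patched := range []uint64{0, 33, 34, 66, 67, 100} {
		fmt.Printf("patched %3d of 100: %s\n", patched, Classify(patched, 100))
	}
}
```

Under these assumptions, patching just over one third of voting power is enough to turn a silent exploit into a visible halt, and patching just over two thirds lets the chain keep running while rejecting the exploit.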

Once a network reached a 100% patching threshold, the Dragonberry vulnerability would be fully remediated and resolved. Since this discovery, the IBC teams at Informal and ICF have continued their intensive reviews of the ICS-23 code and spec, and they are working to include soundness validation as a crucial part of the specification.

Issue #2: Elderflower

While patch development and distribution for the Dragonberry issue were underway, a Cosmos contributor reported a second vulnerability to the Dragonberry Response Team.

Incident Timeline

On Tuesday, October 11th, a second, unrelated critical vulnerability in the Authz module of the Cosmos SDK was reported by Alex Peters from the Confio team via an official vulnerability disclosure channel. Alex also disclosed the details of this bug to Dev, who was able to reproduce and validate the issue. Additionally, core developers from the Axelar team independently discovered and reported this same issue to Dev and Sunny of the Osmosis team within 24 hours of the initial discovery. As multiple parties had independently identified and reported this vulnerability, the likelihood of a bug collision with an attacker knowledgeable about the workings of the code seemed significantly higher than it was with the first Dragonberry issue.

The Vulnerability

The Cosmos SDK is composed of many modules. Within these modules there is the ability to execute messages. As part of the execution, modules should check that the message is valid via the ValidateBasic() method.

It was reported that the message validation pipeline for Authz was missing a critical call of ValidateBasic() in Cosmos SDK v0.44, 0.45 and 0.46.

Though previous reviews of this code had not identified this gap, it had already been fixed on the main (unreleased) branch. The missing ValidateBasic() call could permit invalid state transitions that, given carefully crafted transaction parameters, might allow inflation or theft.
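The class of bug can be sketched in Go as a wrapper message that must validate the messages nested inside it. This is a simplified illustration, not the Cosmos SDK code: the Msg interface, MsgSend, and MsgExec below are hypothetical minimal types, and where exactly the real fix places the call is not described in this post.

```go
package main

import (
	"errors"
	"fmt"
)

// Msg is a minimal, hypothetical interface exposing only ValidateBasic().
type Msg interface {
	ValidateBasic() error
}

// MsgSend is a toy transfer message with a stateless sanity check.
type MsgSend struct {
	From, To string
	Amount   int64
}

func (m MsgSend) ValidateBasic() error {
	if m.Amount <= 0 {
		return errors.New("amount must be positive")
	}
	return nil
}

// MsgExec wraps messages that a grantee executes on behalf of a granter.
type MsgExec struct {
	Grantee string
	Msgs    []Msg
}

// ValidateBasic checks the wrapper AND every nested message. Skipping the
// loop is the kind of omission described above: nested messages would reach
// execution without their own stateless checks.
func (m MsgExec) ValidateBasic() error {
	if m.Grantee == "" {
		return errors.New("grantee must be set")
	}
	if len(m.Msgs) == 0 {
		return errors.New("no messages to execute")
	}
	for i, inner := range m.Msgs {
		if err := inner.ValidateBasic(); err != nil {
			return fmt.Errorf("nested message %d: %w", i, err)
		}
	}
	return nil
}

func main() {
	bad := MsgExec{Grantee: "bob", Msgs: []Msg{MsgSend{From: "alice", To: "bob", Amount: -5}}}
	good := MsgExec{Grantee: "bob", Msgs: []Msg{MsgSend{From: "alice", To: "bob", Amount: 5}}}

	fmt.Println("negative amount:", bad.ValidateBasic())
	fmt.Println("positive amount:", good.ValidateBasic())
}
```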

Mitigation and Remediation

Though Cosmos SDK chains had already reached the mitigation threshold of +33% within a day of ad-hoc security coordination, the official public patch for Dragonberry had not yet been released. As the Response Team was working on the official patch release for Dragonberry when Elderflower was disclosed and multiple parties had independently identified and reported the issue, they opted to simplify the patching process by combining the fixes for both issues in a single release. This decision enabled the Response Team to focus on validating the patches and closing the window of exploitation as quickly as possible, and releases of the Cosmos SDK including patches for both issues were cut on Friday, October 14th, 2022.

Throughout the following weekend, the Response Team worked with Cosmos chains to ensure they upgraded to the official public patch that would fully remediate the Dragonberry and Elderflower issues. Once Cosmos chains applied the patch, the vulnerability was fully resolved.

Retrospective and Next Steps

While no funds were lost as a result of Dragonberry or Elderflower, this series of security issues was a close call. In discussions among core developers since the disclosure and remediation of these issues, core contributors have identified several opportunities to improve application security and security coordination across the Interchain, including:

  • Improving decentralized development processes. Since Cosmos development spread across several teams in 2020, the ecosystem and Interchain have grown significantly. This growth has made it more complex to scale the processes that support proactive prevention, detection, and response to bugs and incidents in this complex, interconnected system.
  • Building out robust security operations, including incident response and coordination mechanisms that reach every corner of the growing Interchain.
  • Renewing investments in core security programs across critical domains that include product security, vulnerability assessment, and observability.

At Cosmoverse, Ethan Buchman announced the formation of a Technical Advisory Board (TAB) for the Interchain Foundation. One purpose of this TAB is to improve the resilience of Cosmos and to ensure that key components of the stack receive more robust security treatment before making it into production. The first step to achieving this is to invite some of the strongest participants back into a united Cosmos.

Moving forward, the TAB will invest in security by focusing on the development of robust security programs and coordination mechanisms. In addition, the TAB will drive a sustainable, mission-driven security culture and prioritize the implementation of secure development lifecycles (SDLC) and world-class application security treatment throughout the Cosmos SDK development process. The TAB will also support the development of a scalable, robust incident response program to coordinate with core teams and validators when security issues and incidents arise in the future.

This incident in particular involved the collaborative efforts of a large number of individuals. We want to thank the teams and individuals involved:

Aaron Craelius (Regen Network)

Adam Tucker (Osmosis)

Aditya Sripal (Interchain)

Amaury Martiny (Regen Network)

Aleksander Bezobchuk (Interchain)

Ari Rubinstein (Agoric)

Billy Rennekamp (Interchain)

Carlos Rodriguez Vega (Interchain)

Colin Axner (Interchain)

Daniela Pavin (Interchain)

Damian Nolan (Interchain)

Dev Ojha (Osmosis)

Ethan Buchman (Informal Systems)

Ethan Frey (Confio)

Jehan Tremback (Informal Systems)

Jelena Djuric (Informal Systems)

Jessy Irwin (Agoric)

Jon Sparks (Informal Systems)

Josh “Dogemos” (Keplr)

Julien Robert (Interchain)

Marko Baricevic (Interchain)

Matt Park (Osmosis)

Niccolo Raspa (Osmosis)

Nicolas Lara (Osmosis)

Romain Ruetschi (Informal Systems)

Shoaib Ahmed (Informal Systems)

Roman Akhtariev (Osmosis)

Sunny Aggarwal (Osmosis)

Thyborg (Informal Systems)

Zaki Manian (Iqlusion)

Zarko Milosevic (Informal Systems)

As part of our Coordinated Vulnerability Disclosure Policy, we operate a bug bounty program with HackerOne. If you believe that you have found a security vulnerability in the Cosmos stack, we encourage you to report it to our program, which was among the first in the blockchain space to offer Safe Harbor to security researchers.

:heart_eyes:

typically, something like this is found after an exploit

many thanks to everyone who worked to identify and patch this issue before it was exploited, and to the 60 or so chain teams that got patched code running and released, and the 300ish ecosystem validators who applied the patched code.

I think that we accomplished something truly special here.

Thanks for everyone’s hard work

Woot woot. Thanks peeps. Good job
