CHIPs discussion phase: Atomic IBC Megablocks

Atomic IBC is a proposed direction that would allow consumer chains to integrate much more deeply with each other, by synchronously processing bundles of transactions between consumer chains while processing all other transactions in parallel.

Atomic IBC’s optional synchronous processing of transaction bundles gives users access to functionality with much better UX (complicated cross-chain workflows can complete in 5 seconds instead of 2-3 minutes), and gives developers the ability to build this functionality with much less code (asynchronous cross-chain workflows must handle any one of their steps failing or timing out, while the state of the chains involved changes during the workflow).

In short, Atomic IBC could allow for the UX of a smart contract platform, with the scalability of a multi-chain system. It also provides a point of differentiation for the Hub in a likely future where shared security is commoditized.

Megablocks vs other approaches

There are several different architectures that could be used for Atomic IBC. The first is Megablocks, where all participating consumer chains are effectively run as one chain, with “mega” blocks that contain the blocks for all the chains. Atomic bundles, which are processed synchronously, go at the beginning of these blocks. The rest of the transactions for each chain are processed separately in parallel.

This has several strengths:

  • It is a simple design, and relatively easy to understand.
  • Every block has the possibility of containing an atomic bundle, with transactions across all participating chains.
  • The tight coupling of chains should allow higher degrees of atomicity and isolation of transaction bundles than possible with more loosely coupled designs.
  • Since all participating chains share a Comet instance, Megablocks may reduce per-chain infrastructure cost when compared to replicated security.

Megablocks also has some drawbacks:

  • Since chains share blocks, custom block times or consensus customizations (e.g. using ABCI++ or the Block SDK) could be harder to support.
  • Since all participating chains share a Comet instance, they may hit scaling bottlenecks that wouldn’t be hit if each chain had its own Comet instance. Benchmarking Megablocks’ scaling characteristics will be a part of the research on Megablocks.
  • If one of the chains has an apphash error, with a naive Megablocks design, all of the chains would have an apphash error. Figuring out how to avoid this apphash contagion will be part of the research on Megablocks.

Another approach is heterogeneous Paxos and similar concepts, where multiple distinct chains sometimes produce shared blocks. These shared blocks contain the atomic bundles. This approach has some strengths:

  • Different chains can have different validator sets, although more overlap results in better performance.
  • The increased separation between chains vs Megablocks may make issues like shared block times or apphash contagion less of a problem.

However, heterogeneous Paxos-like approaches also have some weaknesses:

  • The need to wait for a shared block reduces the UX benefits of the approach as a whole.
  • Shared blocks will likely lead to inconsistent block times, with shared blocks taking longer than normal blocks on a chain.
  • Execution of shared blocks may be more constrained than in Megablocks, where each validator has access to the entire state machines of both chains. This may lead to lower levels of isolation and atomicity than with Megablocks.
  • Atomic bundles spanning more than two chains may be challenging: will shared blocks need to contain every participating chain? In that case, is there still much of a benefit over Megablocks?

Researching Megablocks

We think that the best way to approach these questions is to start building a prototype of the Megablocks architecture, while also investigating other approaches and comparing Megablocks against these approaches.

Research questions to answer:

  • What kind of limitations are imposed by the Megablocks architecture, and how can they be mitigated?
    • Block speed limitations: Is it possible to set a fast “base speed” and allow chains that do not need blocks to be produced as quickly to operate at some multiple of that speed?
    • Consensus customizations: Are any limitations imposed by Megablocks on the use of ABCI++ or Block SDK? What are these limitations, and is it possible to bypass them if they exist?
    • Scaling bottlenecks: Where are scaling bottlenecks hit? How many individual blocks can a Megablock contain? Are there unexpected scaling limitations other than block size?
    • Apphash contagion: Individual consumer chains must be able to experience timeout or apphash errors without taking down every other chain. In theory this is possible, but in practice, it’s a large design space. We will get a better sense of the best way to approach this challenge.
  • How do approaches such as heterogeneous Paxos compare to Megablocks?
    • Atomicity limitations: In the Atomic IBC paper (at the bottom), we discuss 5 different atomicity and isolation attributes which can be achieved by Atomic IBC with the Megablocks architecture. These attributes basically describe the level of integration that can be achieved between chains. However, shared sequencers can only achieve one or two of these attributes. What atomicity can be achieved with heterogeneous Paxos?
    • Block cadence: Shared blocks are likely to take longer than standard blocks. What effect does this have on a chain that is using this atomicity mechanism? Will its blocks come at a slower or irregular rate if they contain cross-chain atomic transactions?

Prototype implementation plan

Our initial approach will be to build Megablocks as an ABCI shim instead of modifying Comet directly. This is an elegant approach and will allow for greater modularity. ABCI stands for “Application BlockChain Interface” and is one of the core strengths of the Cosmos stack. ABCI allows for a blockchain application to be completely customized while interacting with the consensus process over a very simple interface. We will attempt to take advantage of this customizability to implement Megablocks.

The Megablocks shim will essentially multiplex the ABCI interface. It will appear to the Comet instance as if it is a single application, while splitting out processing of transactions in a Megablock into several different blocks which are passed to several applications.

Keeping in mind this rough architecture, we will first implement the MVP version, answering the following questions:

  • Megablock format
    • Megablocks will need some format allowing sub-blocks to be split out by the multiplexer. This is theoretically trivial, but it needs to be defined and implemented.
  • Shim architecture
    • We will write the multiplexer binary and connect it in a testbed to two or more applications, as well as writing benchmarking and test transaction generation instrumentation.
  • Apphash combination
    • The simplest way to combine the apphashes of the different applications is to hash them together. However, there are many other possible techniques. We will evaluate them and choose the best.

Once this MVP is done, we will be able to benchmark the basic performance characteristics of this architecture, and it will give us a platform to build on. From here we can proceed to explore the following questions, although we will probably go back to the community with another discussion phase proposal before then.

  • Timeouts
    • Basic halt avoidance
      • What’s the simplest way to stop a completely dead application from also halting every other application? There needs to be some way to let them time out.
    • Background work
      • Is it possible to let an application that is bogged down with a lot of extra work for one block (example: Osmosis o’clock) do the work in the background, let the other applications proceed, and then catch up in a later block?
    • Different block schedules
      • Is it possible to let applications do background work every block, such that they end up operating at a multiple of the base block time (every 2 blocks, every 4 blocks, etc.)? This would allow a fast base block time while individual chains run at slower multiples of it, giving different chains different block times.
  • Apphash errors
    • If one of the applications has an apphash error, how can this be isolated to prevent it affecting consensus as a whole? There are several directions to investigate:
      • Partial state consensus: Is it possible to allow the validators to come to consensus on only a partial part of the state (all of the non-erroring chains)?
      • Fast recovery: Another route would be for apphash errors to be recovered from quickly, by eliminating the erroring chain in the next round of consensus.
  • Light client and full node compatibility
    • Each Megablock contains the sub-blocks of multiple chains. In order for an IBC light client to get the state proofs it needs, it will need to “see into” the Megablock’s apphash. This is theoretically trivial, but needs to be implemented.
    • With the MVP, it will be easy to run a full node that syncs state from all of the chains at once. But it would be very good to be able to run a full node for only one of the chains, for people who are interested in that chain’s state only.