[Downfall.fail] Shared Tenancy Considered Harmful

Downfall

note and call to action

The first thing I need to say is that Robin from our team is going to have a more detailed write up on this topic.

I will be submitting this forum post and Robin’s forum post to the ICF and to AIB in the hopes that both organizations will delegate only to validators who own their own machines from here forward.

it is entirely unsafe to operate a validator on any shared tenancy machine (AWS, GCE, and any other cloud virtual machine)

This has probably been true since the Rowhammer attacks, but Downfall seems to make clear what many of us have long suspected: everything about cloud-scale virtualization and shared tenancy computing is completely inappropriate for the blockchain space.

You can find more information at

advice for large delegators

who is a large delegator?

  • Liquid staking protocols
  • Funds
  • Family offices
  • Individual large holders of atoms

In a proof of stake network it is folks like yourselves who secure and govern the network. Somebody like me is only able to bring awareness.

I would like to call on our community’s whales to protect the network by migrating stake.

Sometime today I should have a short list of validators that Notional itself delegates to and has specifically whitelisted for this issue. The whitelisting process will consist of a phone call and gathering data about systems. We can’t be certain, but personally I think it is much better than accepting known security problems.

secret network is dead (again)

Please don’t trust it, SGX is a fundamental flaw. Run. The end.

For more information please see:

Jacob you’re so negative and driving people out of cosmos

Actually in this case, I believe that this is true. This is not happy news, it is true news. I think that validators incapable of running their own equipment should leave every cosmos network.

I think that validators who insist upon shared tenancy or are unwilling to attest to using equipment that they own, or are non-communicative, should leave as well.

I will probably be making a governance proposal out of this but I want to make very clear that it’s not something that we can take coercive action on. I think that this governance proposal will be in an advisory format. And what we will be looking for is for the community to approve the advisory.

what about the microcode update?

You don’t know if it’s been applied.

Also, right now it hasn’t been released. Downfall is the latest in a series of very serious problems that have to do with the processes that we created for multi-tenancy in cloud systems. These processes have time and time again been shown to be unsafe.

We did not come here to be some weak ass proof of stake. We came here to win. If you want weak ass proof of stake, go somewhere else where people don’t discuss fundamental security problems and attempt to take action.

The rest of us are going to work on winning.

cloud remote signers

If you have an MPC Horcrux system, but all signers are in the cloud, then you’re vulnerable to Downfall.

If you have a tmkms system and your signer is in the cloud, you’re vulnerable to Downfall.
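The two cases above can be written down as a simple inventory check. This is a minimal sketch, not a description of how Horcrux or tmkms work internally: the `SignerHost` type, the `key_at_risk` function, and the rule that a key is at risk once at least `threshold` shares sit on shared-tenancy hardware are all assumptions made for illustration. Leaking even a single share still reduces your safety margin.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SignerHost:
    """Hypothetical record for one machine that holds key material."""
    name: str
    shared_tenancy: bool  # True for cloud VMs sharing hardware with other tenants


def exposed_shares(hosts: Sequence[SignerHost]) -> int:
    """Count the hosts holding key material on shared-tenancy hardware."""
    return sum(1 for host in hosts if host.shared_tenancy)


def key_at_risk(hosts: Sequence[SignerHost], threshold: int = 1) -> bool:
    """Return True if at least `threshold` key shares sit on shared hardware.

    threshold=1 models a single-key remote signer (one tmkms host);
    a t-of-n threshold cluster would pass threshold=t (an assumption here).
    """
    if not 1 <= threshold <= len(hosts):
        raise ValueError("threshold must be between 1 and the number of hosts")
    return exposed_shares(hosts) >= threshold
```

For example, a 2-of-3 cluster with every cosigner on a cloud VM is flagged, and so is a single cloud-hosted signer with the default threshold of 1.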

help! I am vulnerable

Firstly, you likely should not say publicly that you are vulnerable. The reason for that is not to protect your delegations, but to protect your keys. If an attacker could determine where you host, they might be able to get onto the same machine as you and therefore get at your keys.

Proposal 104 on the Cosmos Hub funded Notional to do security work for the Cosmos Hub, and I believe that helping validators mitigate Downfall is in scope for that funding.

So if you are vulnerable, and you would like assistance moving to exclusively single tenancy equipment, please contact me at Twitter.com/gadikian

credits

… Credit to any whale who makes a major redelegation pending.

I’m Jacob Gadikian, CEO at Notional and this post is my attestation that we use no shared tenancy machines for validation and are therefore not at risk from the downfall attack.

Delegators who would like more information about our systems are encouraged to reach out to me directly on Twitter at twitter.com/gadikian

The majority of our systems are hosted at our office in Hanoi, Vietnam, and we have a small number of systems hosted at Hetzner in Germany and Finland, exclusively on single tenant machines.

All of our cosmos hub systems are hosted in Hanoi on equipment that is in our physical custody.

I am founder of Architect Nodes and happy to see this discussion on Cosmos forum. Architect Nodes is primarily using self owned Bare Metal servers in our possession for validator nodes. We also use rented single tenancy Bare Metal servers for redundancy with different providers that are using remote signers with yubihsm in our possession.

Security is important and we should encourage safer practices when it comes to running validator nodes :saluting_face:

I am Tricky, Co-Founder of Cosmos Spaces. The typical setup for a chain that we validate consists of Horcrux clusters running on a mix of self-hosted and dedicated bare-metal servers. The machines are spread across different locations in order to guarantee as much redundancy and security as possible.

We’re always working to ensure we have a top-tier operation for our delegators.

So let me just clarify one thing, every single machine that you have cryptographic material on is exclusively single tenancy?

Hello, everyone.

Ryan (aka Phunky) here from Lucky Friday Labs, LLC. I have verified with our tech team that we only use “single tenancy” on our nodes. All of our validators (and RPCs, etc) are run on fully owned bare metal servers and are under lock and key in SOC2 compliant Tier 4 and Tier 5 data centers.

The only two members of our team who have physical access to the server racks locked within these cages are gentlemen with over 30 years of data center experience, one of whom used to work for the US Department of Defense and the other used to do cybersecurity work for the FBI.

Rest assured, we take security very seriously and are incredibly cautious in our set up and maintenance of all nodes. We have recently signed lease agreements for more SOC2 data center space in Phoenix, AZ and in Amsterdam, and we hope to have both of these live with new servers by the end of Q4. This will put us on four continents, I believe, and we have aims to be on every continent at some point in 2024.

P.S. - We are currently in the process of our own internal SOC2 audit for our team and its practices, and once we achieve this certification we will begin the process of applying for the more specific and stringent CCSS audit.

How does this relate to validators running multiple Cosmos chains on one bare metal server? Are the same risks there, but in the context of one chain messing with others?

Hey I’m sorry – it doesn’t relate to that.

The risk is specifically to teams running on virtual machines purchased from a cloud service, where the underlying hardware is shared with other organizations.

Having multiple chains on a single bare metal machine is, in my opinion, just as safe as it was before this disclosure… except for what you just mentioned, one chain messing with the others.
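As a rough first check on whether a given Linux x86 host is a virtual machine guest at all, you can look for the `hypervisor` CPU flag in `/proc/cpuinfo`. This is only a sketch with hypothetical function names, and it has clear limits: the flag says the kernel is running under a hypervisor, not whether the physical machine is shared with other organizations, and a hypervisor can hide it. Treat it as a hint, not an attestation.

```python
from pathlib import Path
from typing import Optional


def has_hypervisor_flag(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in cpuinfo-style text lists 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


def running_in_vm(path: str = "/proc/cpuinfo") -> Optional[bool]:
    """Check the local machine; returns None if the file cannot be read."""
    try:
        return has_hypervisor_flag(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("hypervisor flag present:", running_in_vm())
```

A result of `False` on a rented machine is consistent with bare metal, but it is still worth confirming the tenancy model with the provider.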

My opinion here is that all of this really just underscores the need for validator teams to be vigilant.

Code review remains important, and this attack is basically terrifying. Personally I haven’t believed in the security of the cloud since rowhammer, and that was rather a long time ago now.

Hi,

I’m Keefer, founder of Tessellated. We use single tenancy bare metal servers to run our nodes, and our signing machines are located on hardware we own, backed by YubiHSM2s.

Awesome! I’ll add you to the list

I’m quite curious whether a server can be compromised if it’s a self-hosted cloud. In our case, we use Proxmox internally to split bigger servers into smaller containers/virtual machines, but these servers are only used by us and nobody else. I assume we should be safe, as we don’t have other people using the same underlying host as we do, but please let me know if I am wrong here.

We use 3 kinds of servers:

  1. Proxmox instances hosted at home and split into containers - used for all validators
  2. Hetzner virtual machines - for some public nodes
  3. Hetzner dedicated servers - for other public nodes

Additionally, we do not store keys on the validators themselves. We use TMKMS + YubiHSM2 for signing blocks on all validators, and it’s not hosted on a VM at all; it’s a separate server.

I believe that this means that you’re okay, but I am beginning to think that we should ask someone in a formal manner.

This is no normal security exploit. It happens at the hardware level, so first of all it’s very hard to fix, and secondly I think that it can imply things that probably haven’t been realized yet.

If you are the only user of those virtual machines that you described, and you own them then probably okay. Except that they may be able to read stuff from each other’s RAM, so that is super not ideal…

My biggest concern here is that in such a setup, if one server is compromised, the others are as well. But to be honest, if someone can run arbitrary code on your server, you have way bigger issues, and Downfall is not the biggest issue here.

If you are the only user of those virtual machines that you described, and you own them then probably okay.

That’s pretty much it for us. Also I guess for us having a separate layer for HSM is also nice, as we do not store keys on servers, so basically there’s not much to steal.

Meanwhile, reading the news, I realized that a patched Linux kernel is either in the works or has already been released, so I think it’s worth it for everybody (affected or not) to check whether their distro has released a patch, and to update if it has. At least that’s my plan. We have a monitoring system that alerts us if there are upgrades we have not installed yet, and I highly recommend everybody do the same; it really helps make your setup more secure.
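For anyone who wants to check their own host after updating, here is a minimal sketch. It assumes a Linux kernel that already carries the Downfall (Gather Data Sampling) patches, which report status through the sysfs file `/sys/devices/system/cpu/vulnerabilities/gather_data_sampling`. On older kernels the file does not exist, and inside a VM the kernel may only report an unknown status. The function names and the four category labels are hypothetical.

```python
from pathlib import Path
from typing import Optional

# sysfs entry exposed by kernels that include the Gather Data Sampling mitigation
GDS_SYSFS = "/sys/devices/system/cpu/vulnerabilities/gather_data_sampling"


def classify_gds(status: str) -> str:
    """Map the kernel's status string to a coarse category (labels are ours)."""
    text = status.strip()
    if text.startswith("Not affected"):
        return "not_affected"
    if text.startswith("Mitigation"):
        return "mitigated"
    if text.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"  # e.g. status depends on the hypervisor


def read_gds_status(path: str = GDS_SYSFS) -> Optional[str]:
    """Return the raw status line, or None if the kernel does not expose it."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    raw = read_gds_status()
    if raw is None:
        print("kernel does not report GDS status; it may predate the patch")
    else:
        print(f"{raw} -> {classify_gds(raw)}")
```

Hooking something like this into the monitoring described above would raise an alert whenever a host reports anything other than a mitigated or unaffected state.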
