[ANN] Tendermint KMS v0.2: Validator Signing Support

Tendermint Key Management System (KMS), a.k.a. tmkms, is a signature service backed by Hardware Security Modules (HSMs), namely YubiHSM2 devices from Yubico, and soon, the Ledger Nano S. It’s intended to be run alongside Cosmos Validators, ideally on separate physical hosts, providing defense-in-depth for online validator signing keys as well as a central signing service that can be used when operating multiple validators in several Cosmos Zones.

This release (and the previous, unannounced v0.1 release) contains initial support for using tmkms as a priv_validator backend, providing full end-to-end support for storing consensus keys in a YubiHSM2 device and using them on the upcoming gaia-9002 and Game of Stakes testnets.

Note that the code is presently alpha quality and this will be the first time it is (potentially) usable on a live testnet. Expect crashes, bugs, and protocol changes for the time being.

Installation

For detailed install instructions, please see:

Short list:

  1. Install Rust: https://rustup.rs/
  2. Install tmkms: cargo install tmkms
  3. Linux: Configure udev

Creating YubiHSM validator key

Please see tmkms v0.0.1 release notes for details on creating a validator key.

Configure gaiad to accept tmkms connections

tmkms acts as a TCP client for gaiad. An approximate network diagram:

[tmkms] -> [validator gaiad] -> [sentry gaiad] -> [cosmos p2p]

To configure gaiad to accept connections from tmkms, use the newly added priv_validator_laddr configuration option in ~/.gaiad/config/config.toml:

priv_validator_laddr = "tcp://10.11.12.13:26657"

Configuring tmkms

Start with tmkms.toml.example which contains an example KMS configuration:

# Example KMS configuration file
#
# Copy this to 'kms.toml' and edit for your own purposes

[[validator]]
addr = "tcp://example1.example.com:26658" # or "unix:///path/to/socket"
chain_id = "gaia-9000"
reconnect = true # true is the default
secret_key = "path/to/secret_connection.key"

[[providers.yubihsm]]
adapter = { type = "usb" }
auth = { key = 1, password = "password" } # Default YubiHSM admin credentials. Change ASAP!
keys = [{ id = "gaia-9000", key = 1 }]
#serial_number = "0123456789" # identify serial number of a specific YubiHSM to connect to

The addr field of [validator] contains the address of the validator. Typically, this will match priv_validator_laddr from the gaiad configuration.
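
As a rough illustration of the "addr matches priv_validator_laddr" rule, the sketch below compares the two strings after splitting them into host and port. The helper names split_tcp and same_endpoint are hypothetical and are not part of tmkms or gaiad. It only handles tcp:// addresses and compares hosts textually, so a hostname and the IP address it resolves to will not compare equal.

```rust
/// Hypothetical helper (not a tmkms API): split "tcp://host:port"
/// into its host and port. Returns None for anything else.
fn split_tcp(addr: &str) -> Option<(&str, u16)> {
    let rest = addr.strip_prefix("tcp://")?;
    // Split on the last ':' so the port is always the final component.
    let (host, port) = rest.rsplit_once(':')?;
    if host.is_empty() {
        return None;
    }
    Some((host, port.parse().ok()?))
}

/// Hypothetical check: true when the tmkms `addr` and the gaiad
/// `priv_validator_laddr` name the same host (textually) and port.
fn same_endpoint(kms_addr: &str, gaiad_laddr: &str) -> bool {
    match (split_tcp(kms_addr), split_tcp(gaiad_laddr)) {
        (Some(a), Some(b)) => a == b,
        _ => false,
    }
}
```

With the values shown in this post, same_endpoint("tcp://example1.example.com:26658", "tcp://10.11.12.13:26657") is false, which is the kind of mismatch worth catching before launching.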

Launching tmkms

Launch tmkms with tmkms start, which accepts an optional -c parameter pointing to the configuration:

tmkms start -c ~/.tmkms/tmkms.toml

This will launch tmkms, connect to gaiad (if running), and begin providing the signature service.

Help/Support

Having trouble setting up tmkms? Please post on this thread, or in the Cosmos Validators Riot channel.

Thanks!

I’m having trouble building v0.2.0. Apologies if this results from newbie errors; Rust is new to me.

error[E0658]: `crate` in paths is experimental (see issue #45477)
   --> /home/mattharrop/.cargo/registry/src/github.com-1ecc6299db9ec823/ring-0.13.5/src/test.rs:463:9
    |
463 |     use crate::{error, polyfill, private, rand};
    |         ^^^^^

error: aborting due to 71 previous errors

For more information about this error, try `rustc --explain E0658`.
error: Could not compile `ring`.
warning: build failed, waiting for other jobs to finish...
error: failed to compile `tmkms v0.2.0`, intermediate artifacts can be found at `/tmp/cargo-installetzhiM`

Caused by:
  build failed

I tried replacing my rust install with the nightly build, no difference.

Any ideas?

Thanks,
matt

Looks like one of the dependencies was recently updated to require Rust 1.30. Nightly should also work, provided it’s new enough.

What does rustc --version say?

I was on 1.29.2
This is the version that was installed with curl https://sh.rustup.rs -sSf | sh, today.

I have updated to 1.30.1 using rustup update stable, and the problem is solved.

Thanks for the tip!
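
For anyone hitting the same build error, here is a minimal sketch of the version comparison done by hand above. The function names are hypothetical, and the sketch assumes the output of rustc --version starts with "rustc" followed by a major.minor.patch version; anything after that (commit hash, date) is ignored.

```rust
/// Hypothetical helper: extract (major, minor, patch) from the text
/// printed by `rustc --version`, e.g. "rustc 1.29.2 ...".
fn parse_rustc_version(output: &str) -> Option<(u32, u32, u32)> {
    // The version is assumed to be the second whitespace-separated token.
    let version = output.split_whitespace().nth(1)?;
    // Drop any pre-release suffix such as "-nightly" or "-beta.1".
    let core = version.split('-').next()?;
    let mut parts = core.split('.').map(|p| p.parse::<u32>().ok());
    Some((parts.next()??, parts.next()??, parts.next()??))
}

/// True when the reported toolchain is at least `min`.
/// Tuples compare lexicographically, which is what we want here.
fn meets_minimum(output: &str, min: (u32, u32, u32)) -> bool {
    parse_rustc_version(output).map_or(false, |v| v >= min)
}
```

Under these assumptions, meets_minimum("rustc 1.29.2", (1, 30, 0)) is false and meets_minimum("rustc 1.30.1", (1, 30, 0)) is true, matching what was seen in this thread.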

Hello. How does one configure secret_key = "path/to/secret_connection.key" and path = "path/to/signing.key"?
I also get this error while doing a test

tmkms yubihsm keys test 1

yubihsm: unrecognized command `test`

The secret_key file will be autocreated if it doesn’t exist (we should probably add a full tmkms init command à la gaiad, though).

Regarding this whole section of the config:

[[providers.softsign]]
id = "gaia-8000"
path = "path/to/signing.key"

This is a testing-only, software-backed signer for when you want to test tmkms but don’t have a YubiHSM2. If you do have one, you can simply delete this whole section of the config.

If you do actually want to generate a software-backed key though, the command is tmkms generate path/to/signing.key

The command to run a test is: tmkms yubihsm test 1 (you had an extra keys there)

Thanks! Everything works well.

This will get autocreated when it doesn’t exist

@iqlusion I am setting this up to test with a local testnet. Where will the secret key be created?

Thanks!

The secret key will be created at the path you specify, provided the parent directories exist.

Ah… I see! Thank you! That’s very clever.

Got it working, thanks for the instructions!

Just want to be sure that it is intentional that the secret_connection file contains all zeroes?

xxd /opt/tmkms/config/secret_connection.key
00000000: 0000 0000 0000 0000 0000 0000 0000 0000 …
00000010: 0000 0000 0000 0000 0000 0000 0000 0000 …

BR,
Martin

That is definitely NOT intentional, and is a high-severity bug. I have opened an issue here:

https://github.com/tendermint/kms/issues/118
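
Until a fix lands, the xxd check above can be automated. The following is a minimal sketch, not part of tmkms; the function names are hypothetical. It is a plain (non-constant-time) comparison, which should be acceptable here because an all-zero key is not a secret worth protecting.

```rust
use std::{fs, io, path::Path};

/// True when every byte is zero. Note that an empty slice also counts
/// as "all zero", so an empty key file is flagged too.
fn is_all_zero(bytes: &[u8]) -> bool {
    bytes.iter().all(|&b| b == 0)
}

/// Hypothetical helper: read a key file and report whether it is blank.
/// I/O errors (missing file, permissions) are returned to the caller.
fn key_file_is_blank(path: &Path) -> io::Result<bool> {
    Ok(is_all_zero(&fs::read(path)?))
}
```

Calling key_file_is_blank on the secret_connection.key path from your tmkms.toml and getting Ok(true) would indicate a key affected by this bug.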

Yes I get the same.

BTW, what capabilities are required for tmkms? I want to create an authkey which has only the capabilities needed for gaiad.

I have reproduced this problem on Linux. The problem does not seem to occur on macOS. I am continuing to investigate.

The only capability presently needed is asymmetric_sign_eddsa, however this may change as tmkms evolves.

I plan on writing a (customizable) provisioning tool that can generate a multi-level account hierarchy including administrative, operational, auditing, and application roles. When that happens, it will consider the full set of capabilities needed for each role.

I’ve put out a small point release, v0.2.1, which should hopefully address all known bugs. That includes a fix for another feature which shipped in the original release but was buggy, so I didn’t announce it:

priv_validator.json import support

If you’ve already registered a key in the genesis.json for an upcoming testnet, such as gaia-9002, you can import it into a YubiHSM2 using the new tmkms yubihsm keys import command:

$ tmkms yubihsm keys import --path ./priv_validator.json 9002

This will import the Ed25519 key from the ./priv_validator.json file into slot 9002 of the configured YubiHSM2 (you will need to have tmkms.toml configured in advance).

Secret Connection key mini-post-mortem

A quick aside: the format for SecretConnection keys (i.e. secret_key under the [[validator]] section) is now Base64. This means any previously generated keys will not load and are unusable, but that’s ok because any keys generated by a release build were all zeroes! If you see an error like this, please delete the offending key to ensure it’s regenerated:

config error: error loading SecretConnection key from tmkms.key: invalid encoding

The root cause of the issue was a bug in the subtle-encoding library used to provide constant-time encoding/decoding of secret keys. This library performed the encoding/decoding operations inside a debug assertion (i.e. debug_assert_eq!), so that code was compiled out of release builds:

https://github.com/iqlusioninc/crates/pull/126
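
To make this class of bug concrete, here is a minimal sketch. It is not the actual subtle-encoding code; copy_into, encode_buggy, and encode_fixed are hypothetical stand-ins. The point is that the arguments of debug_assert_eq! are only evaluated when debug assertions are enabled, so any side effect placed there vanishes in a default release build.

```rust
/// Stand-in for an encoding step: copies `src` into `dst` and returns
/// the number of bytes written.
fn copy_into(src: &[u8], dst: &mut [u8]) -> usize {
    dst[..src.len()].copy_from_slice(src);
    src.len()
}

/// Buggy shape: the real work happens inside the assertion's argument.
/// With debug assertions off, `copy_into` is never called and the
/// zero-initialized buffer is returned untouched.
fn encode_buggy(src: &[u8]) -> Vec<u8> {
    let mut dst = vec![0u8; src.len()];
    debug_assert_eq!(copy_into(src, &mut dst), src.len());
    dst
}

/// Fixed shape: do the work unconditionally, then assert on the result.
fn encode_fixed(src: &[u8]) -> Vec<u8> {
    let mut dst = vec![0u8; src.len()];
    let written = copy_into(src, &mut dst);
    debug_assert_eq!(written, src.len());
    dst
}
```

In a debug build both functions return a copy of the input. In a release build with default settings, encode_buggy returns all zeroes while encode_fixed still returns the copy, which is why tests that only ever run in debug mode do not catch this.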

The bug has been fixed, and the CI configuration for this repository has been updated to run tests in release mode. I have confirmed that running the tests in release mode would’ve caught this particular bug:

https://travis-ci.org/iqlusioninc/crates/jobs/459482729

Quick status report on attempting to use tmkms on gaia-9002:

We had to squash a few bugs around things like key encodings, but we managed to have tmkms live for a little bit.

We encountered a bug which appears to be a blocker for using tmkms on a testnet for the time being:

https://github.com/tendermint/kms/issues/130

It looks like it shouldn’t be difficult to resolve though. Hopefully we can have another point release out soon, along with instructions for how to enroll a tmkms-backed validator into a testnet.

I’ve released tmkms v0.2.2 which, to my knowledge, addresses all remaining known bugs.

We’ve been using it live on gaia-9002, and so far it has produced over 30,000 signatures.

Tony, what should be the settings for both gaiad and tmkms if both of them are running on the same machine?

There are a couple of options for localhost usage. When configuring tmkms.toml:

[[validator]]
addr = [...] 

You can use either TCP or a Unix domain socket to talk to gaiad running on the same host:

  • Pick a port (e.g. 12345) and use tcp://127.0.0.1:12345 or tcp://localhost:12345. This will use the local TCP/IP loopback.
  • Use a Unix domain socket: unix:///path/to/socket

When configuring gaiad’s ~/.gaiad/config/config.toml, make sure to use the same value for priv_validator_laddr.
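
As a sanity check for a same-host setup, the sketch below classifies an address string as either a loopback TCP endpoint or a Unix domain socket. The LocalAddr type and classify function are hypothetical and are not how tmkms parses its config. The name "localhost" is accepted textually rather than resolved.

```rust
use std::net::IpAddr;

/// Hypothetical classification of a same-host address.
#[derive(Debug, PartialEq)]
enum LocalAddr {
    Loopback { port: u16 },
    UnixSocket { path: String },
}

/// Returns Some(..) for "tcp://<loopback>:<port>" or "unix:///abs/path",
/// and None for anything that is not clearly local.
fn classify(addr: &str) -> Option<LocalAddr> {
    if let Some(path) = addr.strip_prefix("unix://") {
        // Only absolute socket paths are accepted in this sketch.
        return if path.starts_with('/') {
            Some(LocalAddr::UnixSocket { path: path.to_string() })
        } else {
            None
        };
    }
    let rest = addr.strip_prefix("tcp://")?;
    let (host, port) = rest.rsplit_once(':')?;
    let port: u16 = port.parse().ok()?;
    // "localhost" is matched by name; IP literals use the std loopback check.
    let is_loopback = host == "localhost"
        || host.parse::<IpAddr>().map_or(false, |ip| ip.is_loopback());
    if is_loopback {
        Some(LocalAddr::Loopback { port })
    } else {
        None
    }
}
```

Both tcp://127.0.0.1:12345 and tcp://localhost:12345 classify as Loopback with port 12345, while a routable address such as tcp://10.11.12.13:26657 yields None.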