Solution for running active-active validator nodes

Many validators want to run two active validator nodes in production. We have run this setup without any problems for over a month, so I want to share it with you.

  1. Servers: two validators + one tmkms server.

We didn't use sentries in this test, but you can add them for production.

2 Steps:

2.1 Install the HSM server; see tmkms for more information.

Make sure your tmkms version is v0.6.3 or later; otherwise you risk double signing.
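A quick guard against starting an older build is to compare a version string against v0.6.3 before launching. This is only a minimal sketch: it parses a string you supply (how you obtain it, for example from your package manager or the binary's own version output, is left open), and `is_safe_tmkms_version` is a hypothetical helper name, not part of tmkms.

```python
import re

MIN_TMKMS = (0, 6, 3)  # minimum version named in this post


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH triple from a string like 'tmkms 0.6.3'."""
    m = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in m.groups())


def is_safe_tmkms_version(text, minimum=MIN_TMKMS):
    """Return True if the version in `text` is at least `minimum`."""
    # Tuples compare element by element, so (0, 10, 0) >= (0, 6, 3) is True.
    return parse_version(text) >= minimum
```

Comparing integer tuples rather than raw strings avoids the classic mistake of treating "0.10.0" as older than "0.6.3".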

2.2 Set up tmkms.toml:

[[chain]]
id = "kava-testnet-2000"
key_format = { type = "bech32", account_key_prefix = "kavapub", consensus_key_prefix = "kavavalconspub" }

[[validator]]
addr = "tcp://validator-1-ip:26658" 
chain_id = "kava-testnet-2000"
secret_key = "/data/test.key"

[[validator]]
addr = "tcp://validator-2-ip:26658" 
chain_id = "kava-testnet-2000"
secret_key = "/data/test.key"

[[providers.yubihsm]]
adapter = { type = "usb" }
auth = { key = 4, password = "kms-validator-password-1y58g2...." }
keys = [{chain_ids = ["kava-testnet-2000"], key = 11}]

2.3 Start the tmkms server.

The tmkms server will then try to connect to the validator nodes.

2.4 Edit config.toml on each validator:

# TCP or UNIX socket address for Tendermint to listen on for
# connections from an external PrivValidator process
priv_validator_laddr = "tcp://0.0.0.0:26658"
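Since tmkms dials out to each validator, port 26658 has to be reachable from the KMS host once the node is running. A minimal sketch for checking that from the KMS machine follows; the helper name `can_connect` and the timeout value are assumptions of mine, and a successful TCP connection only shows the port is open, not that the signing handshake works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example (placeholder hostnames from the tmkms.toml above):
#   for host in ("validator-1-ip", "validator-2-ip"):
#       print(host, can_connect(host, 26658))
```

Because the listener is bound to 0.0.0.0, it is worth restricting access to this port to the KMS host with a firewall.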

2.5 Start the validator servers.

If it succeeds, you will see logs like this:

04:46:58 [info] [kava-testnet-2000@tcp://47.101.10.160:26658] signed PreVote: 2FC0C142C5 at h/r/s 34499/0/6 (102 ms)
04:46:59 [info] [kava-testnet-2000@tcp://kava-test.ping.pub:26658] signed PreVote:2FC0C142C5 at h/r/s 34499/0/6 (123 ms)
04:46:59 [info] [kava-testnet-2000@tcp://kava-test.ping.pub:26658] signed PreCommit: 2FC0C142C5 at h/r/s 34499/0/6 (102 ms)
04:46:59 [info] [kava-testnet-2000@tcp://47.101.10.160:26658] signed PreCommit: 2FC0C142C5 at h/r/s 34499/0/6 (199 ms)
04:47:00 [info] [kava-testnet-2000@tcp://kava-test.ping.pub:26658] signed PreVote:F4F042F8EB at h/r/s 34499/1/6 (123 ms)
04:47:01 [info] [kava-testnet-2000@tcp://47.101.10.160:26658] signed PreVote:F4F042F8EB at h/r/s 34499/1/6 (123 ms)
04:47:01 [info] [kava-testnet-2000@tcp://kava-test.ping.pub:26658] signed PreCommit:F4F042F8EB at h/r/s 34499/1/6 (156 ms)
04:47:01 [info] [kava-testnet-2000@tcp://47.101.10.160:26658] signed PreCommit:F4F042F8EB at h/r/s 34499/1/6 (212ms)
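In the log above, both validators get the same block ID signed for each height, round, and message type, which is what you want to see. A rough way to watch for the opposite case is to group log lines by that slot and flag any slot with more than one block ID. This is a sketch that assumes the line format shown above; the regex and the name `find_conflicts` are mine, and because the log prints only a short block ID, treat it as a heuristic rather than proof.

```python
import re
from collections import defaultdict

# Matches "[chain@addr] signed Kind:BLOCK at h/r/s H/R/S", tolerating an
# optional space after the colon and any colour-code text before the bracket.
LINE_RE = re.compile(
    r"\[(?P<chain>[^\[\]@\s]+)@(?P<addr>[^\]\s]+)\] signed "
    r"(?P<kind>\w+):\s*(?P<block>[0-9A-Fa-f]+) at h/r/s "
    r"(?P<height>\d+)/(?P<round>\d+)/(?P<step>\d+)"
)


def find_conflicts(lines):
    """Return {(height, round, kind): {block_id: {addrs}}} for slots
    in which more than one distinct block ID was signed."""
    seen = defaultdict(lambda: defaultdict(set))
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a signing line
        slot = (int(m["height"]), int(m["round"]), m["kind"])
        seen[slot][m["block"]].add(m["addr"])
    return {
        slot: {block: set(addrs) for block, addrs in blocks.items()}
        for slot, blocks in seen.items()
        if len(blocks) > 1
    }
```

An empty result for a log window means every slot in it was signed for a single block ID; any non-empty result deserves immediate attention.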

That's all.

Thanks to @tarcieri for providing such good software.


@ping, thanks for sharing!

Do the two validators need to share the same secret key?

A quick note on this: while this functionality is available in the latest releases of KMS, we’re still worried about validators running this configuration in a steady state. See this blog post on the topic:

Notable excerpt:

For this reason we recommend validators don’t run this sort of configuration in perpetuity, but use the functionality to failover between validators. Based on this incident, and others we’ll describe below, we in fact recommend you only run in this configuration on testnets for now because we think this operational mode needs a lot more testing to be safe.


I don't think the secret key must be the same.