Redundancy of Validator Server (Physical Datacenter)
Security and maintenance have been mentioned multiple times among validators, since they play a critical role in the safe operation of validator nodes within the network.
While considering multiple security structures, such as the one described in the article “Sentry Node Architecture”, our team identified backing up the validator server as one of the crucial issues.
The Cosmos Network documentation recommends that validators keep their validator nodes in a local datacenter, while operating sentry nodes in a cloud environment such as AWS or GCP.
But even in a well-managed datacenter, several unexpected issues can bring the validator node down:
The datacenter loses power
Numerous other causes that can affect the healthy operation of a validator node
Thus, we came up with an idea to prevent the above problems by setting up the validator node with the procedure below:
Mount the same NFS share on two servers
Set up the validator on the shared NFS mount
Create identical accounts on the two servers
Run the validator under the newly created account on one of the servers
If one server shuts down, or goes down for any reason, we can simply run the validator again on the other, identical server. Since both servers are connected to a single network storage, there should be only one copy of the block data, which prevents double-signing, provided the validator only ever runs on one server at a time.
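The "only one server at a time" condition can be checked before starting the validator on the standby server. Below is a minimal Python sketch of one possible guard, assuming the active server periodically writes a heartbeat file to the shared mount. The file name, the 30-second threshold, and all function names are hypothetical and not part of any Cosmos tooling. A check like this does not rule out double-signing on its own (NFS caching and races between the two servers can still interfere), so treat it as an illustration only.

```python
import json
import os
import socket
import tempfile
import time

HEARTBEAT_FILE = "validator.heartbeat"  # hypothetical file name on the shared mount
STALE_AFTER_SECONDS = 30.0              # hypothetical staleness threshold


def write_heartbeat(shared_dir, host=None, now=None):
    """Record which host runs the validator and when it last checked in."""
    record = {
        "host": host or socket.gethostname(),
        "time": time.time() if now is None else now,
    }
    # Write to a temporary file and rename it, so a reader never sees
    # a half-written record.
    fd, tmp_path = tempfile.mkstemp(dir=shared_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(record, f)
    os.replace(tmp_path, os.path.join(shared_dir, HEARTBEAT_FILE))


def read_heartbeat(shared_dir):
    """Return the last heartbeat record, or None if none was written yet."""
    try:
        with open(os.path.join(shared_dir, HEARTBEAT_FILE)) as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def may_start_validator(shared_dir, host=None, now=None,
                        stale_after=STALE_AFTER_SECONDS):
    """Return True only if no other host has a fresh heartbeat."""
    host = host or socket.gethostname()
    now = time.time() if now is None else now
    record = read_heartbeat(shared_dir)
    if record is None or record["host"] == host:
        return True
    # Another host wrote the last heartbeat: start only if it is stale.
    return (now - record["time"]) > stale_after
```

In this sketch, the active server would call write_heartbeat on a timer, and the standby server would call may_start_validator before launching the validator process.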
This method allows the validator to keep up with the current block height, but the network connection speed is extremely slow. If a SAN (Storage Area Network) were used instead of NFS, this issue should be resolved, but the cost of a SAN is extremely high, so it would not be effective when evaluated from multiple points of view.
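To put a number on "slow", one option is to time small synced writes on the shared mount and compare them with a local disk. The Python sketch below is only a rough probe. The function name, the 512-byte payload, and the number of writes are arbitrary assumptions, and the result says nothing about how the validator itself accesses storage.

```python
import os
import statistics
import tempfile
import time


def median_fsync_latency(directory, writes=20, payload=b"x" * 512):
    """Median seconds per small write that is flushed with fsync."""
    fd, path = tempfile.mkstemp(dir=directory)
    samples = []
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data out to the storage backend
            samples.append(time.perf_counter() - start)
    finally:
        # Always clean up the probe file, even if a write fails.
        os.close(fd)
        os.remove(path)
    return statistics.median(samples)
```

Running the function once against the NFS mount point and once against a local directory gives two figures that can be compared directly.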
Even though connection speed is a known problem, it would be very helpful to hear what other validators think of backing up a validator server with this method. If any other effective method can be adopted, please feel free to share!