case study
- one of the sentries was dead for more than 20 minutes
- the dead sentry came back up after that period
- a relay has the sentry configured as a persistent peer
- relay node config: pex and seedmode are both false
test results
- the relay never retries after the dead sentry comes back up
- an rpc dial_peer call from the relay returns an error: Permanently Removed
- the only way to re-establish the connection is to restart the relay node, or to rpc-dial the relay node from the sentry
problem and solution
- even though the sentry had been offline for some time, the relay should keep trying to reconnect to it at least once every minute. There is no harm in doing so, and it is fair given what PERSISTENT means. The 1-minute interval could be made configurable in config.toml.
- gaiad should never prevent users from manually dialing a peer via the rpc endpoint. An operator who does that has a good enough reason to do it.
opinion
- this is quite urgent to fix because it affects the connection stability of a validator and of the whole network.
- in particular, blocking an operator's manual rpc dial is a malfunction, in my opinion.