Monitoring/Alerting for your Validator

You mean the Grafana dashboard? It should work with any network your node is running, as long as the connection to Prometheus doesn’t get changed. So no need to update it :slight_smile: I haven’t tried it in 7004 yet, though.

We have updated it and I’m testing with it now. I keep receiving an absent-validator notification whenever our validator node fails to send a vote for a given height. You can add the bot and subscribe to your validator address to try it.


Where is prometheus.yml?

Thanks for your work on this! I’m trying to configure the dashboard to monitor one of my validators. Could you please provide some guidance on the required settings for the Prometheus data source, shown in the screenshot?

The URL in the HTTP section needs to be set to the HTTP API URL of the Prometheus server that is scraping your validator’s metrics.

The Grafana documentation on this topic might be helpful as well.
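
One way to confirm the URL before entering it in Grafana is to query the Prometheus HTTP API directly. Below is a minimal Python sketch, assuming Prometheus listens on http://localhost:9090 (its default port). The function names are made up for illustration. It runs the built-in `up` query and reports which scrape targets Prometheus can reach.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus runs locally on its default port.
PROMETHEUS_URL = "http://localhost:9090"


def targets_up(response: dict) -> dict:
    """Map each scraped instance to True/False from an `up` query response."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % response)
    return {
        sample["metric"].get("instance", "?"): sample["value"][1] == "1"
        for sample in response["data"]["result"]
    }


def query_up(base_url: str = PROMETHEUS_URL) -> dict:
    """Run the instant query `up` against the Prometheus HTTP API."""
    url = base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": "up"})
    with urllib.request.urlopen(url, timeout=5) as resp:
        return targets_up(json.load(resp))
```

If `query_up()` returns a dictionary that includes your validator’s metrics target, the same base URL should work as the data source URL in Grafana.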

Can you share your prometheus.yml so that we can understand the setup better?

Hey, try switching the access mode of the data source in Grafana from Server to Browser.

Because I’ve been asked this a lot, I’ve put together small step-by-step instructions for setting up Grafana with my dashboard. I hope this works and helps people get started.

  1. Step: Install Grafana (http://docs.grafana.org/installation/debian/) & start it

  2. Step: In ~/.gaiad/config/config.toml set prometheus = true (in the [instrumentation] section)

  3. Step: Restart gaiad to apply config changes

  4. Step: Download prometheus (https://prometheus.io/docs/introduction/first_steps/), edit prometheus.yml
    Add the following under scrape_configs:

       # COSMOS MONITORING
       # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
       - job_name: 'cosmops'

         # metrics_path defaults to '/metrics'
         # scheme defaults to 'http'.

         static_configs:
         - targets: ['localhost:26660']
           labels:
             group: 'cosmops'
    
  5. Step: start prometheus with: ./prometheus --config.file=prometheus.yml

  6. Step: Open Grafana in Browser & Do initial Setup

  7. Step: Under Configuration -> Data Source -> Add a new Data Source

    Name: CosmosDataSource
    Type: Prometheus
    URL: http://localhost:9090
    Scrape Interval: 5s
    Rest is Default

    -> Save & Test should add the Data Source

  8. Step: In Grafana go to Dashboard -> Import
  9. Step: Paste the ID 7044 (this is my Dashboard template for Grafana), choose “CosmosDataSource” as Data Source
  10. Step: You should now have a working Dashboard :slight_smile:
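
If the dashboard stays empty, it can help to check that gaiad is actually serving metrics on the port Prometheus scrapes. The following is a rough Python sketch, assuming the metrics endpoint is http://localhost:26660/metrics (the target from the prometheus.yml above). The function names are hypothetical, and the parser only covers the simple lines of the Prometheus text format.

```python
import urllib.request

# Assumption: gaiad serves Prometheus metrics here (same target as in prometheus.yml).
METRICS_URL = "http://localhost:26660/metrics"


def parse_metrics(text: str) -> dict:
    """Parse Prometheus text-format lines into {name_with_labels: value}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            head, _, tail = line.rpartition("}")
            name = head + "}"
        else:
            name, _, tail = line.partition(" ")
        fields = tail.split()  # value, optionally followed by a timestamp
        if not fields:
            continue
        try:
            metrics[name] = float(fields[0])
        except ValueError:
            continue  # ignore lines whose value is not numeric
    return metrics


def fetch_metrics(url: str = METRICS_URL) -> dict:
    """Download and parse the metrics page exposed by the node."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

An empty result or a connection error from `fetch_metrics()` usually points at step 2 or 3 (Prometheus metrics not enabled, or gaiad not restarted) rather than at Grafana.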

I’m working on some monitoring and alerting for validators and sentries -

1 - Using Icinga for alerts

2 - Updating a Grafana/Prometheus dashboard

3 - Log analysis

I plan to open source the tools when they’re ready.

For starters, I’m wondering if anyone has done any research into log patterns that indicate missed pre-commits?


Adding feedback from -

@mattharrop If set to do so, gaiad will write every signature in each block to syslog. Just check for your validator’s ID in the block’s signatures; if it’s not there, that’s a miss.

@haasted We’ve created a tool to monitor for pre-votes. https://github.com/validator-network/votewatcher Feedback welcome
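
The signature check that @mattharrop describes can also be done against the node’s RPC instead of syslog. Here is a minimal Python sketch under a few assumptions: the Tendermint RPC is reachable at http://localhost:26657, the `/block` response lists votes under `last_commit.precommits` (older versions) or `last_commit.signatures` (newer versions), and the function names are invented for illustration. Note that the `last_commit` of block H holds the signatures for block H-1.

```python
import json
import urllib.request

# Assumption: the node's Tendermint RPC listens on the default local port.
RPC_URL = "http://localhost:26657"


def validator_signed(block_result: dict, validator_address: str) -> bool:
    """Return True if validator_address appears among the commit signatures."""
    last_commit = block_result["block"]["last_commit"]
    # Assumption: votes are under "precommits" (older) or "signatures" (newer);
    # an absent vote shows up as null or as an entry with an empty address.
    votes = last_commit.get("precommits") or last_commit.get("signatures") or []
    wanted = validator_address.upper()
    return any(
        vote is not None and (vote.get("validator_address") or "").upper() == wanted
        for vote in votes
    )


def check_height(height: int, validator_address: str, rpc_url: str = RPC_URL) -> bool:
    """Fetch block `height` and check the signatures it carries (for height - 1)."""
    url = "%s/block?height=%d" % (rpc_url, height)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return validator_signed(json.load(resp)["result"], validator_address)
```

A `False` result for your validator’s address corresponds to a miss in the sense described above.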


Our old gaiabot uses a systemd package to monitor the journal. If you run gaiad as a systemd service, its log can be read from the journal. You may want to take a look.

However, I don’t quite like this approach, as it uses a lot of resources to keep checking every line of the process’s journal log to decide whether it should send out an alert message.


Feedback from @jack https://twitter.com/jack_zampolin/status/1115987603243683841

A Grafana dashboard compatible with all the cosmos-sdk and tendermint based blockchains: https://github.com/zhangyelong/cosmos-dashboard



This post of mine is almost 2 years old. I’m not keeping this up-to-date anymore. Please find some more recent information.