MongoDB Health System Notification
What?
- Can we get system notifications or errors in the UI when MongoDB (single or multi-node) is down or in the RECOVERING state?
- Similar to the OpenSearch cluster health notifications.
- Or possibly something on the Node page.
Why?
- We recently had two cases where a customer's archiving was failing because one of their MongoDB nodes was in RECOVERING mode.
- They had no idea that any of their MongoDB nodes were having issues, and their archives had been failing for several months.
- If they had seen errors or notifications in the Graylog UI telling them their MongoDB nodes were unhealthy or down, we may have been able to avoid the archiving issues.
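To illustrate the kind of check being asked for, here is a minimal sketch in Python (not Graylog code). It assumes the health signal comes from MongoDB's `replSetGetStatus` admin command, whose result lists each replica set member with `name`, `health`, and `stateStr` fields. The function names, the set of states treated as healthy, and the connection URI are assumptions made for the example.

```python
from typing import Any

# Assumption for this sketch: only these member states count as healthy.
HEALTHY_STATES = {"PRIMARY", "SECONDARY", "ARBITER"}


def unhealthy_members(status: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (name, stateStr) for every member that is down or not in a healthy state.

    `status` is expected to look like the document returned by replSetGetStatus.
    """
    problems = []
    for member in status.get("members", []):
        state = member.get("stateStr", "UNKNOWN")
        # health is 1 when the member is reachable and 0 when it is not.
        if member.get("health", 0) != 1 or state not in HEALTHY_STATES:
            problems.append((member.get("name", "?"), state))
    return problems


def fetch_status(uri: str = "mongodb://localhost:27017") -> dict[str, Any]:
    """Fetch replica set status using pymongo (hypothetical helper, placeholder URI).

    On a standalone (non-replica-set) MongoDB this command fails, so a
    single-node setup would need a plain connectivity check instead.
    """
    from pymongo import MongoClient  # imported here so the check above has no dependencies

    client = MongoClient(uri, serverSelectionTimeoutMS=3000)
    return client.admin.command("replSetGetStatus")
```

A notification could then be raised whenever `unhealthy_members(fetch_status())` returns a non-empty list, for example when one of the three nodes reports `RECOVERING`.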
Your Environment
- Graylog Version: 6.0.5
- MongoDB Version: 5.0.21
- Operating System: Ubuntu
The environment I have the most detail on has 1 load balancer, 3 Graylog nodes, 3 OpenSearch nodes. The three Graylog nodes are also running MongoDB, and replication is configured.
Just as an FYI for anyone reading: to catch archival issues, we (not one of the customers mentioned) have an event definition in place for message:"ARCHIVING_SUMMARY: Indices could not be archived yet" on the All system events stream.