More explanation on Altinity Clickhouse Operator Grafana Dashboard
Hi there. Altinity is super powerful and convenient to use. Just a trifling piece of advice: could you give some more details or descriptions for the prometheus-monitoring parameters (e.g. chi_clickhouse_event_ZooKeeperUserExceptions, chi_clickhouse_event_InsertQuery, etc.)? Sometimes I find it a bit confusing to understand the meaning behind those numbers. Best regards!
Look at the detailed descriptions returned by the following queries:
SELECT * FROM system.events
SELECT * FROM system.metrics
SELECT * FROM system.asynchronous_metrics
For example, for event_ZooKeeperUserExceptions:
SELECT * FROM system.events WHERE event ILIKE '%ZooKeeperUserExceptions%' SETTINGS system_events_show_zero_values=1;
If the description field is empty (funny, as in this case ;), you need to figure it out from the ClickHouse source: search https://github.com/ClickHouse/ClickHouse/search?q=ZooKeeperUserExceptions and look at https://github.com/ClickHouse/ClickHouse/blob/07afb42e79793ceca025a4c01c70c3408c8dfa9d/src/Common/ZooKeeper/IKeeper.cpp#L28
UserExceptions means the error codes returned in the following corner cases: https://github.com/ClickHouse/ClickHouse/blob/07afb42e79793ceca025a4c01c70c3408c8dfa9d/src/Common/ZooKeeper/IKeeper.cpp#L128-L135
It's not a fatal error; only ZooKeeper hardware errors need your attention.
Some of the metrics in our dashboard are collected by custom queries in the operator's metrics-exporter; see the details in https://github.com/Altinity/clickhouse-operator/blob/master/pkg/apis/metrics/clickhouse_fetcher.go#L27-L138
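If you look up many dashboard metrics this way, the lookup can be scripted. Below is a minimal Python sketch that turns a Prometheus metric name into the corresponding description queries. The helper name `lookup_queries` is hypothetical, and the prefix-to-table mapping is an assumption inferred from the example names in this thread (`chi_clickhouse_event_...` for system.events, and a guessed `chi_clickhouse_metric_...` for system.metrics and system.asynchronous_metrics); check it against the exporter source linked above. Metrics produced by custom queries will not match any system table.

```python
import re
from typing import List

# ASSUMPTION: this prefix-to-table mapping is inferred from the metric names
# mentioned in this thread; it is not taken from the exporter's documentation.
PREFIX_TO_TABLES = {
    "chi_clickhouse_event_": [("system.events", "event")],
    "chi_clickhouse_metric_": [
        ("system.metrics", "metric"),
        ("system.asynchronous_metrics", "metric"),
    ],
}


def lookup_queries(prometheus_name: str) -> List[str]:
    """Return SQL queries that may show the description of a dashboard metric.

    Returns an empty list when the name matches no known prefix
    (for example, a metric collected by a custom exporter query).
    """
    for prefix, tables in PREFIX_TO_TABLES.items():
        if not prometheus_name.startswith(prefix):
            continue
        name = prometheus_name[len(prefix):]
        # Accept only plain identifier characters so the name is safe to
        # embed in a SQL string literal.
        if not re.fullmatch(r"[A-Za-z0-9_.]+", name):
            return []
        queries = []
        for table, column in tables:
            sql = f"SELECT * FROM {table} WHERE {column} ILIKE '%{name}%'"
            if table == "system.events":
                # Show events whose counter is still zero as well.
                sql += " SETTINGS system_events_show_zero_values=1"
            queries.append(sql)
        return queries
    return []


if __name__ == "__main__":
    for q in lookup_queries("chi_clickhouse_event_ZooKeeperUserExceptions"):
        print(q)
```

Paste the printed queries into clickhouse-client. ILIKE is used, as in the example above, so a metric whose name was altered on export still has a chance of matching.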