Add AlertLifeCycleObserver that allows consumers to hook into Alert life cycle
What this pull request does
This pull request introduces a new AlertLifeCycleObserver interface that is accepted by the API, the Dispatcher, and the notification pipeline. The interface contains methods for tracking what happens to an alert as it moves through Alertmanager.
Motivation
When a customer complains “I think my alert is delayed”, we currently have no straightforward way to troubleshoot. At minimum, we should be able to quickly identify whether the problem is post-notification (we sent to the receiver on time but the receiver has some delay) or pre-notification.
By introducing a new interface that hooks into the alert life cycle, consumers of the Alertmanager package would be able to implement whatever observability solution works best for them.
This is great! I've been thinking about doing something similar, for the exact reasons mentioned:
when a customer complains “I think my alert is delayed”, we currently have no straightforward way to troubleshoot. At minimum, we should be able to quickly identify if the problem is post-notification (we sent to the receiver on time but the receiver has some delay) or pre-notification.
I'm not 100% sure I understand how it would be used outside of prometheus/alertmanager. Can you share some code? Also, though it's not exactly the same thing, I wonder if we should implement tracing inside Alertmanager to provide this visibility into "where's my alert?".
The use case we have in mind is simply adding logs for these events. This effectively becomes an alert history that we can query when a customer comes to us. We would like the flexibility to decide how we collect and format the logs and how we store them.
Just some nits but overall looks good!
@grobinson-grafana @simonpasquier could you have a look at this PR when you have time? Thank you
Rebased PR and fixed conflicts
@simonpasquier this draft PR in cortex gives the general idea of our use case for this feature https://github.com/cortexproject/cortex/pull/5602/commits
@gotjosh good day. Can you take a look at this one?