tpalashki comments

Results: 4 comments by tpalashki

Add API for alert cleanup

If a job stops executing, its last termination status will remain unchanged and equal to the status of its last execution (let's assume it was a User Error). Consequently, the...

Add API for alert cleanup

Good point. This will be helpful, but it will not cover all use cases. For example, jobs that execute rarely (once a month) or jobs that never execute (with schedule...

Add API for alert cleanup

Good idea. We can introduce a configuration similar to Prometheus (`resolve_timeout`) to specify the time after which the alert is automatically resolved.
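
As a rough illustration of the timeout idea in this comment, the following is a minimal sketch, not the project's actual implementation. The function name `should_auto_resolve` and its parameters are hypothetical; only the `resolve_timeout` name comes from the comment. It assumes an alert is considered resolved once it has not fired again within the configured timeout.

```python
from datetime import datetime, timedelta


def should_auto_resolve(
    last_fired: datetime,
    now: datetime,
    resolve_timeout: timedelta,
) -> bool:
    """Return True if the alert has not fired again within resolve_timeout.

    Hypothetical helper: the name and signature are assumptions made for
    illustration and do not come from the original discussion.
    """
    return now - last_fired >= resolve_timeout


# Example: an alert that last fired 6 minutes ago, with a 5-minute timeout.
last_fired = datetime(2024, 1, 1, 12, 0)
now = datetime(2024, 1, 1, 12, 6)
print(should_auto_resolve(last_fired, now, timedelta(minutes=5)))
```

With a check like this, a job that stops executing would have its stale alert cleared after the timeout instead of staying in its last termination status indefinitely.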

Improve the robustness of data job deployments

It is not possible to determine the root cause of this particular error. It happens very rarely (I have seen it only once), and searching for the above error yields no results. However,...