configuration-anomaly-detection
Configuration anomaly detection for OSD clusters
https://issues.redhat.com/browse/OSD-23312 This PR showcases how CAD could use the Kubernetes API, with an example implementation of the remediation for https://issues.redhat.com/browse/OCPBUGS-33863. The implementation is hidden behind a `CAD_EXPERIMENTAL_ENABLED` flag that we will...
This is a PoC as part of RDA7: this PR adds a custom interceptor that allows routing all alerts to CAD without starting a new pipeline for each alert. To...
Adds a deployment for the interceptor. To be merged after https://github.com/openshift/configuration-anomaly-detection/pull/280.
Bumps [github.com/openshift/osd-network-verifier](https://github.com/openshift/osd-network-verifier) from 0.4.11 to 1.0.0. Release notes Sourced from github.com/openshift/osd-network-verifier's releases. v1.0.0 What's Changed copy URL lists from golden-ami repo by @eth1030 in openshift/osd-network-verifier#238 Add version info on build...
We no longer post LimitedSupport if the target cluster belongs to an organization with the `managed_critical_customer` capability. Instead, we redirect SRE to take additional steps for those clusters. See...
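The decision described in this PR can be sketched in Go as follows. This is only an illustration under stated assumptions: the `Organization` type, the `Action` values, and the `decide` function are hypothetical, and the capability key is taken literally from the text above rather than from the actual OCM capability naming.

```go
package main

import "fmt"

// Organization is a hypothetical stand-in for the organization data that
// CAD would look up for the target cluster.
type Organization struct {
	Capabilities map[string]bool
}

// Capability key as written in the PR description (assumed spelling).
const managedCriticalCustomer = "managed_critical_customer"

// Action names the two outcomes described in the PR (hypothetical values).
type Action string

const (
	PostLimitedSupport Action = "post-limited-support"
	RedirectToSRE      Action = "redirect-to-sre"
)

// decide returns RedirectToSRE for organizations that carry the capability
// and PostLimitedSupport for everyone else.
func decide(org Organization) Action {
	if org.Capabilities[managedCriticalCustomer] {
		return RedirectToSRE
	}
	return PostLimitedSupport
}

func main() {
	critical := Organization{Capabilities: map[string]bool{managedCriticalCustomer: true}}
	regular := Organization{}
	fmt.Println(decide(critical))
	fmt.Println(decide(regular))
}
```

Reading from a nil map returns the zero value in Go, so an organization without any capabilities falls through to the LimitedSupport path.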
This incorporates version `v1.1.2` of `osd-network-verifier`, which includes a breaking change to the `ValidateEgressInput` struct type. This additionally includes an update to fetching the machine image ID to no longer...
Bumps [github.com/openshift/osd-network-verifier](https://github.com/openshift/osd-network-verifier) from 0.4.11 to 1.1.2. Release notes Sourced from github.com/openshift/osd-network-verifier's releases. v1.1.2 What's Changed [OSD-25868](https://issues.redhat.com//browse/OSD-25868): Adds aws-hosted-cp as a valid Platform name by @joshbranham in openshift/osd-network-verifier#275 New Contributors @joshbranham...
E2E Test: ClusterMonitoringErrorBudgetBurn Alert Trigger and Recovery ([OSD-30030](https://issues.redhat.com//browse/OSD-30030)) Description: This PR adds an E2E test for the ClusterMonitoringErrorBudgetBurn alert, targeting AWS CCS clusters. The test misconfigures the user-workload-monitoring-config ConfigMap in...
[OSD-18645](https://issues.redhat.com/browse/OSD-18645) - CAD implementation for CannotRetrieveUpdatesSRE Sample ticket: https://redhat.pagerduty.com/incidents/Q1S45W54TK1QKU#:~:text=%E2%9A%A0%EF%B8%8F%20ClusterVersion%20error%20detected,primary%20for%20review Updated sample ticket: https://redhat.pagerduty.com/incidents/Q2UVRI8YGLPP3G Updated sample ticket: https://redhat.pagerduty.com/incidents/Q01H4ZQHM90EML
Adds a way to test cadctl remediations without pushing the metadata.yaml file to the main branch of the configuration-anomaly-detection repository. Tested on Fedora. Testing steps: Create a staging cluster. Follow...