WIP: next-gen-prowjob-dispatcher: service holding information about current prowjob/cluster assignment
Dispatcher reworked.
assumptions:
- nothing outside of cmd (so pkg, libs) will be changed, because I do not want to break the current dispatcher during the transition phase
- therefore I am using the existing configs to achieve the goal, which is less efficient (due to garbage) than introducing new ones
The goal is to create a service that clients can query (mainly the scheduler plugin, but also cluster-bot) to get the static cluster assignment for a job with the given parameters.
Service has 2 new configs:
- [1] job assignment config, which should land on a PVC
- [2] cluster config, which resides in o/release
Additionally, it still reuses the old dispatcher config, which is becoming read-only. The service reacts to changes in the cluster config [2] and adds clusters to, or removes them from, the job assignments in config [1].
Simplified mode of operation of the service:
- upon start it reads all 3 configs
- it initializes a map with the current assignment for REST communication
- when the cluster config is edited, it queries Prometheus to get job volumes; they are kept in a cache for 24h or until restart to limit Prometheus communication
- it works like the old dispatcher and reshuffles the config weekly, taking Prometheus data into consideration
- it writes the job configuration to a separate config, which should be stored on a PVC
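To illustrate the "kept in cache for 24h or until restart" step, here is a minimal sketch, not the code from this PR. The names `volumeCache`, `fetchFunc` and `volumeTTL`, and the job name in `main`, are hypothetical; `fetchFunc` only stands in for the real Prometheus query.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// volumeTTL mirrors the 24h retention described above.
const volumeTTL = 24 * time.Hour

// fetchFunc stands in for the real Prometheus query (hypothetical signature).
type fetchFunc func() (map[string]float64, error)

// volumeCache keeps the last fetched job volumes in memory only,
// so a restart of the service naturally drops them.
type volumeCache struct {
	mu        sync.Mutex
	ttl       time.Duration
	fetch     fetchFunc
	volumes   map[string]float64
	fetchedAt time.Time
}

func newVolumeCache(ttl time.Duration, fetch fetchFunc) *volumeCache {
	return &volumeCache{ttl: ttl, fetch: fetch}
}

// get returns the cached volumes and calls fetch only when the cache
// is empty or older than ttl.
func (c *volumeCache) get() (map[string]float64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.volumes != nil && time.Since(c.fetchedAt) < c.ttl {
		return c.volumes, nil
	}
	v, err := c.fetch()
	if err != nil {
		return nil, err
	}
	c.volumes, c.fetchedAt = v, time.Now()
	return v, nil
}

func main() {
	calls := 0
	cache := newVolumeCache(volumeTTL, func() (map[string]float64, error) {
		calls++ // count how often the "Prometheus" stand-in is hit
		return map[string]float64{"example-job": 42}, nil
	})
	cache.get()
	cache.get()
	fmt.Println("fetch calls:", calls) // prints "fetch calls: 1"
}
```

A single timestamp for the whole result set keeps the sketch short; per-job expiry would only matter if volumes were fetched job by job.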
Query (request/response)
$ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"organization":"openshift","repository":"ci-tools","type":"presubmit","branch":"master"}'
{"cluster":"build05"}
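For readers who want to see the request/response shape in code, below is a rough sketch of a handler serving this query from an in-memory map. The JSON field names are taken from the example above; everything else (the `request`/`response` types, keying the map by the whole request, answering 404 for an unknown job) is an assumption for illustration and may differ from what the PR does.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// request mirrors the JSON body from the curl example.
type request struct {
	Organization string `json:"organization"`
	Repository   string `json:"repository"`
	Type         string `json:"type"`
	Branch       string `json:"branch"`
}

// response mirrors the JSON answer from the curl example.
type response struct {
	Cluster string `json:"cluster"`
}

// handler serves assignments from a map that the real service would
// build from its configs at startup (assumed layout).
func handler(assignments map[request]string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST only", http.StatusMethodNotAllowed)
			return
		}
		var req request
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "invalid JSON", http.StatusBadRequest)
			return
		}
		cluster, ok := assignments[req]
		if !ok {
			// Assumption: unknown jobs yield 404.
			http.Error(w, "no assignment", http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(response{Cluster: cluster})
	}
}

// query sends one JSON body through the handler in-process and
// returns the status code and trimmed response body.
func query(h http.Handler, body string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body))
	h.ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	h := handler(map[request]string{
		{Organization: "openshift", Repository: "ci-tools", Type: "presubmit", Branch: "master"}: "build05",
	})
	code, body := query(h, `{"organization":"openshift","repository":"ci-tools","type":"presubmit","branch":"master"}`)
	fmt.Println(code, body) // prints: 200 {"cluster":"build05"}
}
```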
Follow-up in clients
Clients should implement a short-term cache to avoid firing the same request multiple times (an example implementation can be found in the needs-rebase plugin).
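A possible shape for such a client-side cache, as a sketch only: this is not the needs-rebase implementation, and `cachingClient`, `clientTTL`, the one-minute TTL and the string key are all hypothetical choices. The `lookup` function stands in for the actual HTTP call to the dispatcher.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// clientTTL is an arbitrary short lifetime chosen for illustration.
const clientTTL = time.Minute

type entry struct {
	cluster string
	expires time.Time
}

// cachingClient remembers recent answers per key so that identical
// requests within ttl do not reach the dispatcher again.
type cachingClient struct {
	mu      sync.Mutex
	ttl     time.Duration
	lookup  func(key string) (string, error) // stand-in for the HTTP request
	entries map[string]entry
}

func newCachingClient(ttl time.Duration, lookup func(string) (string, error)) *cachingClient {
	return &cachingClient{ttl: ttl, lookup: lookup, entries: map[string]entry{}}
}

// clusterFor returns the cached cluster for key, or calls lookup and
// stores the result when there is no unexpired entry.
func (c *cachingClient) clusterFor(key string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[key]; ok && time.Now().Before(e.expires) {
		return e.cluster, nil
	}
	cluster, err := c.lookup(key)
	if err != nil {
		return "", err
	}
	c.entries[key] = entry{cluster: cluster, expires: time.Now().Add(c.ttl)}
	return cluster, nil
}

func main() {
	calls := 0
	client := newCachingClient(clientTTL, func(key string) (string, error) {
		calls++ // count requests that would hit the service
		return "build05", nil
	})
	client.clusterFor("openshift/ci-tools/presubmit/master")
	client.clusterFor("openshift/ci-tools/presubmit/master")
	fmt.Println("requests sent:", calls) // prints "requests sent: 1"
}
```

Holding the mutex during `lookup` serializes requests, which keeps the sketch simple; a real client might prefer to release the lock around the network call.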
One more thing to add is not to rely on curl/http requests if the new cluster config is provided. I will add that later.
After that change I will most likely be able to remove special clusters as well.
/retest
/test remaining-required
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jmguzik, Prucek
- ~~OWNERS~~ [Prucek,jmguzik]
/retest-required
Remaining retests: 0 against base HEAD b5318adfcb4f80709fdd4f73dc4d89e98ae8ee7d and 2 for PR HEAD 40b2359c1b55b89deafc6772cb55a80d70d53038 in total
@jmguzik: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/security | 40b2359c1b55b89deafc6772cb55a80d70d53038 | link | false | /test security |
/retest-required
Remaining retests: 0 against base HEAD e6d9c196f64fb27a8817053270d21cb72d02398c and 1 for PR HEAD 40b2359c1b55b89deafc6772cb55a80d70d53038 in total
/retest
Something related to arm.