Use one proxy per VM
Modify the runtime to launch one proxy per VM. This will improve our HA story, since there will no longer be a single point of failure on the system (the system-level instance of `cc-proxy`).
Plan (ordered steps)
- [x] Update the proxy to allow 1 proxy / pod, alongside the system-level systemd `cc-proxy` service (https://github.com/clearcontainers/proxy/pull/167).
  At this point:
  - multiple instances of `cc-proxy` will be able to co-exist with the system-level `cc-proxy` systemd service.
- [x] Update virtcontainers to spawn 1 proxy / pod (https://github.com/containers/virtcontainers/pull/483).
- [x] Re-vendor the virtcontainers changes into the runtime (#835).
  At this point:
  - the system-level `cc-proxy` systemd instance won't be used (although it can continue to run without interfering with the pod-specific `cc-proxy` instances).
  - we will lose the KSM feature of `cc-proxy`: since all proxy requests will be handled by the pod-specific `cc-proxy` instances, and since we don't want multiple proxies fighting over KSM kernel settings, the KSM code will never be run.
- [x] Create a daemon to replace the KSM functionality in `cc-proxy` (https://github.com/clearcontainers/proxy/issues/168), which is now implemented by https://github.com/kata-containers/ksm-throttler.
- [ ] Remove the following features from the proxy:
  - KSM.
  - socket activation.
  - handling code for >1 pods. (https://github.com/clearcontainers/proxy/pull/177 for KSM + socket activation).
- [ ] Update packaging for the proxy (https://github.com/clearcontainers/packaging/issues/198) to do something like:

      $ sudo systemctl stop cc-proxy.socket
      $ sudo systemctl stop cc-proxy.service
      $ sudo rm /lib/systemd/system/cc-proxy.socket
      $ sudo rm /lib/systemd/system/cc-proxy.service
      $ sudo systemctl daemon-reload

- [ ] Release updated packages for proxy, runtime and the new KSM daemon.
  At this point, there will no longer be a system-level `cc-proxy` instance.
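
For the packaging step, here is a minimal sketch of how the removal commands above could be made safe to re-run (for example, on a host where the units were already removed). It is only an illustration, not the actual packaging script: the function name `remove_system_proxy_units` and the `UNIT_DIR` / `SYSTEMCTL` override variables are hypothetical, introduced so the logic can be dry-run against a scratch directory.

```shell
#!/bin/sh
# Sketch only: idempotent removal of the system-level cc-proxy units.
# UNIT_DIR and SYSTEMCTL are hypothetical overrides added for illustration;
# they are not part of any existing packaging script.
UNIT_DIR="${UNIT_DIR:-/lib/systemd/system}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

remove_system_proxy_units() {
    # Stop the socket first so nothing re-activates the service.
    for unit in cc-proxy.socket cc-proxy.service; do
        # A missing or already-stopped unit must not abort the upgrade.
        "$SYSTEMCTL" stop "$unit" 2>/dev/null || true
    done

    for unit in cc-proxy.socket cc-proxy.service; do
        # rm -f succeeds whether or not the unit file is still present.
        rm -f "$UNIT_DIR/$unit"
    done

    # Make systemd forget the removed unit files.
    "$SYSTEMCTL" daemon-reload
}
```

Calling `remove_system_proxy_units` as root applies the change to the real unit directory; setting `UNIT_DIR` to a scratch directory and `SYSTEMCTL=true` exercises the same logic without touching systemd.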
/cc @sameo, @grahamwhaley, @sboeuf, @dvoytik.
Hi @jodh-intel,
Overall IMHO this makes sense.
Btw, I guess https://github.com/clearcontainers/proxy/pull/107 still could be useful with the above plan. Although the PR should be refactored a bit.
EDIT: orthography
@dvoytik yes this would still make sense, but with a lower priority. Having one proxy per VM would solve most of our issues. In case of a proxy crash, we would only lose one pod.
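
To illustrate the crash-isolation point, here is a rough sketch using stand-in processes. It assumes nothing about how the runtime really launches `cc-proxy`: `PROXY_CMD`, `RUN_DIR` and the three function names are hypothetical, and `sleep 60` merely stands in for a pod-specific proxy.

```shell
#!/bin/sh
# Sketch only: one stand-in "proxy" process per pod, tracked by PID file.
# PROXY_CMD and RUN_DIR are hypothetical names used for this illustration;
# "sleep 60" stands in for a real pod-specific cc-proxy invocation.
PROXY_CMD="${PROXY_CMD:-sleep 60}"
RUN_DIR="${RUN_DIR:-$(mktemp -d)}"

# Start a dedicated proxy for one pod and record its PID.
start_pod_proxy() {
    $PROXY_CMD >/dev/null 2>&1 &
    echo "$!" > "$RUN_DIR/$1.pid"
}

# Succeed if the proxy belonging to the given pod is still running.
pod_proxy_alive() {
    kill -0 "$(cat "$RUN_DIR/$1.pid")" 2>/dev/null
}

# Simulate a crash of one pod's proxy.
crash_pod_proxy() {
    pid="$(cat "$RUN_DIR/$1.pid")"
    kill "$pid" 2>/dev/null || true
    # Reap the child so that kill -0 reports it as gone.
    wait "$pid" 2>/dev/null || true
}
```

After `start_pod_proxy pod-a` and `start_pod_proxy pod-b`, running `crash_pod_proxy pod-a` leaves `pod_proxy_alive pod-b` succeeding: only the pod whose proxy died is affected, whereas a single shared proxy would take every pod down with it.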