Modify the runtime to launch one proxy per VM. This will improve our HA story since the system-level instance of cc-proxy will no longer be a single point of failure on the system.

Plan (ordered steps)

[x] Update the proxy to allow 1 proxy / pod, alongside the system-level systemd cc-proxy service (https://github.com/clearcontainers/proxy/pull/167). At this point:
- multiple instances of cc-proxy will be able to co-exist with the system-level cc-proxy systemd service.
[x] Update virtcontainers to spawn 1 proxy / pod (https://github.com/containers/virtcontainers/pull/483).
[x] Re-vendor the virtcontainers changes into the runtime (#835). At this point:
- the system-level cc-proxy systemd instance will no longer be used (although it can continue to run without interfering with the pod-specific cc-proxy instances).
- we will lose the KSM feature of cc-proxy: all proxy requests will be handled by the pod-specific cc-proxy instances and, since we don't want multiple proxies fighting over KSM kernel settings, the KSM code will never be run.
[x] Create a daemon to replace the KSM functionality in cc-proxy (https://github.com/clearcontainers/proxy/issues/168). This is now implemented by https://github.com/kata-containers/ksm-throttler.
[ ] Remove the following features from the proxy:
- KSM.
- socket activation.
- code for handling more than one pod. (See https://github.com/clearcontainers/proxy/pull/177 for KSM + socket activation.)

[ ] Update packaging for the proxy (https://github.com/clearcontainers/packaging/issues/198) to do something like:

 $ sudo systemctl stop cc-proxy.socket
 $ sudo systemctl stop cc-proxy.service
 $ sudo rm /lib/systemd/system/cc-proxy.socket
 $ sudo rm /lib/systemd/system/cc-proxy.service
 $ sudo systemctl daemon-reload
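
A minimal sketch of how a package script could run the steps above so that they are safe to repeat, for example on a system where the units have already been removed. The function name `remove_system_proxy` and the `UNIT_DIR` / `SYSTEMCTL` overrides are hypothetical and exist only for illustration; the sketch assumes it runs as root, as package scripts normally do, so `sudo` is omitted.

```shell
#!/bin/sh
# Hypothetical helper, not part of the actual packaging.
# UNIT_DIR and SYSTEMCTL can be overridden to do a dry run.
UNIT_DIR="${UNIT_DIR:-/lib/systemd/system}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

remove_system_proxy() {
    changed=0
    # Handle the socket unit first so that socket activation cannot
    # restart the service while it is being removed.
    for unit in cc-proxy.socket cc-proxy.service; do
        if [ -e "$UNIT_DIR/$unit" ]; then
            # Ignore a failing stop (the unit may already be inactive).
            $SYSTEMCTL stop "$unit" || true
            rm -f "$UNIT_DIR/$unit"
            changed=1
        fi
    done
    # Reload systemd only if a unit file was actually removed.
    if [ "$changed" -eq 1 ]; then
        $SYSTEMCTL daemon-reload
    fi
}
```

Setting `SYSTEMCTL=echo` and pointing `UNIT_DIR` at a scratch directory before calling `remove_system_proxy` prints the systemctl commands instead of running them.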

[ ] Release updated packages for proxy, runtime and the new KSM daemon. At this point, there will no longer be a system-level cc-proxy instance.
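
For context on the KSM step: the KSM kernel settings are system-wide sysfs knobs under `/sys/kernel/mm/ksm`, which is why several per-pod proxies writing to them would conflict and a single daemon is preferable. The sketch below only reads a few of those knobs; the function name `show_ksm_settings` and the `KSM_DIR` override are hypothetical, and this is not how cc-proxy or ksm-throttler is implemented.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# KSM_DIR can be overridden, e.g. to point at a test directory.
KSM_DIR="${KSM_DIR:-/sys/kernel/mm/ksm}"

show_ksm_settings() {
    # run: whether KSM is active; pages_to_scan and sleep_millisecs
    # control how aggressively the kernel scans for mergeable pages.
    for knob in run pages_to_scan sleep_millisecs; do
        if [ -r "$KSM_DIR/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$KSM_DIR/$knob")"
        else
            # KSM may not be enabled in the running kernel.
            printf '%s=unavailable\n' "$knob"
        fi
    done
}
```

Calling `show_ksm_settings` prints one `name=value` line per knob. Because there is only one set of these files per host, whichever process writes last wins.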

Nov 07 '17 14:11 jodh-intel

/cc @sameo, @grahamwhaley, @sboeuf, @dvoytik.

Nov 13 '17 14:11 jodh-intel

Hi @jodh-intel,

Overall IMHO this makes sense.

Btw, I guess https://github.com/clearcontainers/proxy/pull/107 could still be useful with the above plan, although the PR would need to be refactored a bit.

EDIT: orthography

Nov 13 '17 14:11 dvoytik

@dvoytik yes, this would still make sense, but with a lower priority. Having one proxy per VM would solve most of our issues: in case of a proxy crash, we would only lose one pod.

Nov 13 '17 15:11 sboeuf

Use one proxy per VM
