Leader election
Related to #409 are there plans to add leader election functionality similar to https://docs.openshift.com/container-platform/4.7/operators/operator_sdk/osdk-leader-election.html to the java operator sdk?
see https://kubernetes.slack.com/archives/CAW0GV7A5/p1643798302258639?thread_ts=1643796946.554699&cid=CAW0GV7A5
@shawkins I added this to 3.3 milestone for now.
To summarize my understanding: there are two cases where running multiple instances of an operator happens and/or is desirable, and leader election makes sure only one of them actively reconciles, so the non-leader instances don't execute reconcilers:
- Minimize downtime in the following cases:
  - There is an updated version of the operator being released, and the deployment first creates the new operator pod, then stops the old one. (For now, handle this scenario with the `Recreate` deployment strategy.)
  - Minimize downtime after an operator crash, by keeping multiple instances running all the time. However, there are multiple strategies for this: if an operator instance is not the leader, should it still populate the caches and just not reconcile the events?
- Make sure fail-over operator instances are already provisioned on the cluster, so that if the active operator's pod crashes, it cannot happen that a new instance fails to start because cluster resources are unavailable.
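The `Recreate` strategy mentioned above is set on the operator's Deployment. A minimal config fragment (all names and the image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-operator              # placeholder name
spec:
  replicas: 1
  strategy:
    type: Recreate               # old pod is terminated before the new one starts
  selector:
    matchLabels:
      app: my-operator
  template:
    metadata:
      labels:
        app: my-operator
    spec:
      containers:
        - name: operator
          image: example.com/my-operator:latest   # placeholder image
```

With the default `RollingUpdate` strategy the new pod starts while the old one is still running, so two reconciling instances briefly overlap; `Recreate` avoids that at the cost of a short downtime window.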
In summary, there is one design question:
- Should the non-leader operator instances activate event sources and just not trigger reconciliation until elected as leader? Or should the operator effectively only start once it is elected leader? Both have pros and cons: activated event sources consume resources (possibly polling in some cases, and cache memory), but on the other hand they minimize downtime in case syncing the caches on startup takes a long time.
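The first option ("event sources active, reconciliation gated") can be sketched in plain Java, independent of any SDK API. `GatedReconciler` and its methods are hypothetical names for illustration, not java-operator-sdk classes:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: event sources keep feeding events (so local caches
// stay warm), but reconciliation only runs while this instance is leader.
class GatedReconciler {
    private final AtomicBoolean leader = new AtomicBoolean(false);
    private final AtomicInteger reconcileCount = new AtomicInteger();

    void onElectedLeader() { leader.set(true); }
    void onLostLeadership() { leader.set(false); }

    // Called by event sources for every event; cache updates would happen
    // here regardless of leadership.
    boolean handleEvent(String resourceId) {
        if (!leader.get()) {
            return false; // not leader: cache is updated, but no reconcile
        }
        reconcileCount.incrementAndGet(); // leader: reconcile the resource
        return true;
    }

    int reconciles() { return reconcileCount.get(); }
}
```

The alternative design would simply not construct the event sources at all until `onElectedLeader()` fires, trading the warm caches for lower resource usage.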
Maybe the strategy should be configurable i.e. the framework would support event replication but let users activate or deactivate it depending on their needs?
Yes, a feature flag would be nice for that, agree.
Just one more note: in both cases, if an operator instance becomes the leader it will need to reconcile all the resources anyway, since there is no information on how long the previous leader was down.
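That full resync on leadership acquisition can be sketched as a plain-Java simulation. `ReconcileQueue` and `LeaderTransition` are hypothetical names, not SDK classes:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a deduplicating queue of resource ids awaiting reconciliation.
class ReconcileQueue {
    private final Set<String> pending = new LinkedHashSet<>();

    synchronized void enqueue(String resourceId) { pending.add(resourceId); }

    synchronized List<String> drain() {
        List<String> out = new ArrayList<>(pending);
        pending.clear();
        return out;
    }
}

class LeaderTransition {
    // On becoming leader we cannot know what the previous leader processed or
    // how long it was down, so every known resource is re-enqueued.
    static List<String> onStartLeading(Collection<String> allKnownResources, ReconcileQueue queue) {
        allKnownResources.forEach(queue::enqueue);
        return queue.drain();
    }
}
```

In a real operator the "all known resources" would come from listing the watched custom resources (or from the already-synced cache, in the design where non-leaders keep event sources active).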