
Leader election

Open shawkins opened this issue 4 years ago • 5 comments

Related to #409 are there plans to add leader election functionality similar to https://docs.openshift.com/container-platform/4.7/operators/operator_sdk/osdk-leader-election.html to the java operator sdk?

shawkins avatar Apr 28 '21 11:04 shawkins

see https://kubernetes.slack.com/archives/CAW0GV7A5/p1643798302258639?thread_ts=1643796946.554699&cid=CAW0GV7A5

csviri avatar Feb 02 '22 10:02 csviri

@shawkins I added this to 3.3 milestone for now.

To summarize my understanding: there are two cases where running multiple instances of an operator happens and/or is desirable, and leader election makes sure only one of them is actively reconciling, so the non-leader instances don't execute reconcilers:

  1. Minimize downtime in the following cases:
    • An updated version of the operator is being released, and the deployment first creates the new operator pod, then stops the old one. (For now, handle this scenario with the Recreate deployment strategy.)
    • Minimize the downtime caused by an operator crash by keeping multiple instances running at all times. However, there are multiple strategies here: if an operator instance is not the leader, should it still populate its caches and just not reconcile the events?
  2. Make sure fail-over operator instances are already provisioned on the cluster, so that if the active operator's pod crashes, it cannot happen that a replacement instance fails to start because cluster resources are unavailable.
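The lease mechanism underlying leader election can be sketched with a minimal in-memory simulation. This is illustrative only: a real operator holds a `coordination.k8s.io/v1` Lease resource in the cluster (e.g. via the fabric8 client's leader-election support), and the class names below are not part of any SDK API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// In-memory stand-in for a Kubernetes Lease object. In a real operator
// this record lives in the cluster as a coordination.k8s.io/v1 Lease.
class Lease {
    final String holder;
    final Instant renewTime;
    Lease(String holder, Instant renewTime) {
        this.holder = holder;
        this.renewTime = renewTime;
    }
}

class LeaseLock {
    private Lease lease;
    private final Duration leaseDuration;

    LeaseLock(Duration leaseDuration) {
        this.leaseDuration = leaseDuration;
    }

    // Try to acquire or renew the lease for the given identity.
    // Returns true if that identity is now the leader. A non-holder can
    // only take over once the current lease has expired unrenewed.
    synchronized boolean tryAcquireOrRenew(String identity, Instant now) {
        boolean expired = lease == null
                || now.isAfter(lease.renewTime.plus(leaseDuration));
        if (expired || lease.holder.equals(identity)) {
            lease = new Lease(identity, now);
            return true;
        }
        return false;
    }

    synchronized Optional<String> currentHolder() {
        return Optional.ofNullable(lease).map(l -> l.holder);
    }
}

public class LeaderElectionDemo {
    public static void main(String[] args) {
        LeaseLock lock = new LeaseLock(Duration.ofSeconds(15));
        Instant t0 = Instant.now();

        // Two operator replicas race for the lease; only one wins.
        System.out.println("a is leader: " + lock.tryAcquireOrRenew("operator-a", t0)); // true
        System.out.println("b is leader: " + lock.tryAcquireOrRenew("operator-b", t0)); // false

        // After the lease duration elapses without renewal,
        // the standby instance takes over.
        Instant later = t0.plus(Duration.ofSeconds(20));
        System.out.println("b took over: " + lock.tryAcquireOrRenew("operator-b", later)); // true
    }
}
```

This also shows why fail-over speed depends on the lease duration: the standby can only take over after the old lease expires.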

In summary, there is one design question:

  • Should the non-leader operator instances activate event sources and simply not trigger reconciliation until elected as leader? Or should the operator basically only start once it is elected leader? Both have pros and cons: activated event sources consume resources (possibly polling in some cases, and caching resources in memory), but on the other hand they minimize downtime in cases where syncing the caches on startup takes a long time.
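The trade-off above could be exposed as a configuration flag. The sketch below is purely hypothetical: `LeaderElectionConfiguration` and `startEventSourcesBeforeLeadership` are illustrative names, not an existing java-operator-sdk API.

```java
// Illustrative sketch only: these class and option names are hypothetical,
// not part of the current java-operator-sdk API.
public class LeaderElectionConfiguration {
    private final String leaseName;
    private final String leaseNamespace;
    // true  -> non-leader instances start event sources and keep caches warm
    //          (faster fail-over, higher resource usage)
    // false -> non-leader instances stay idle until elected
    //          (slower fail-over, minimal resource usage)
    private final boolean startEventSourcesBeforeLeadership;

    public LeaderElectionConfiguration(String leaseName, String leaseNamespace,
                                       boolean startEventSourcesBeforeLeadership) {
        this.leaseName = leaseName;
        this.leaseNamespace = leaseNamespace;
        this.startEventSourcesBeforeLeadership = startEventSourcesBeforeLeadership;
    }

    public String getLeaseName() { return leaseName; }

    public String getLeaseNamespace() { return leaseNamespace; }

    public boolean startEventSourcesBeforeLeadership() {
        return startEventSourcesBeforeLeadership;
    }
}
```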

csviri avatar Jun 08 '22 12:06 csviri

Maybe the strategy should be configurable i.e. the framework would support event replication but let users activate or deactivate it depending on their needs?

metacosm avatar Jun 08 '22 21:06 metacosm

Maybe the strategy should be configurable i.e. the framework would support event replication but let users activate or deactivate it depending on their needs?

Yes, agreed, a feature flag would be nice for that.

csviri avatar Jun 09 '22 11:06 csviri

Just one more note: in both cases, when an operator instance becomes the leader, it will need to reconcile all the resources anyway, since there is no information about how long the previous leader was down.
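That note could translate into an "on start leading" callback that re-triggers reconciliation for everything known. The interfaces below are hypothetical stand-ins, not the SDK's actual cache or event-handling API:

```java
import java.util.List;

// Hypothetical stand-ins for the SDK's resource cache and event handler.
interface ResourceCache { List<String> listResourceIds(); }
interface EventHandler { void triggerReconciliation(String resourceId); }

class LeadershipCallbacks {
    private final ResourceCache cache;
    private final EventHandler eventHandler;

    LeadershipCallbacks(ResourceCache cache, EventHandler eventHandler) {
        this.cache = cache;
        this.eventHandler = eventHandler;
    }

    // Called when this instance is elected leader. Every known resource is
    // re-reconciled, because there is no record of how long the previous
    // leader was down or which events were missed in the meantime.
    void onStartLeading() {
        for (String id : cache.listResourceIds()) {
            eventHandler.triggerReconciliation(id);
        }
    }
}
```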

csviri avatar Jun 09 '22 14:06 csviri