operator-sdk Advanced Topics - Leader election when running multiple replicas for High Availability

I think it would be good to add, to the Leader election section, the use case of running several instances of the custom controller workload/pod (with only one being 'active') in order to improve the High Availability characteristics of your operator's controller. With standby replicas already running, the operator deployer has more control over how fast a "blocked" 'active' controller instance is replaced by another running (but not yet 'active') instance.

Adding this HA-related use case to the one currently described (a Deployment upgrade where the new and old pod instances run in parallel) would better show where the leader election support from Operator SDK can be used.

Oct 20 '23 13:10 antaloala

@antaloala Thanks for raising this issue. Although not every operator needs to be HA-enabled, having documentation on this use case would be helpful. Would you be open to submitting a PR for this?

Oct 30 '23 18:10 varshaprasad96

I am new to open source. Can I have a go at this? I don't know exactly what is going on in the issue, but I would love to contribute.

Nov 05 '23 09:11 AHB102

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Feb 04 '24 01:02 openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

Mar 05 '24 08:03 openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen. Mark the issue as fresh by commenting /remove-lifecycle rotten. Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Apr 05 '24 00:04 openshift-bot

@openshift-bot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Apr 05 '24 00:04 openshift-ci[bot]