pekko-management

Rolling update support for k8s 1.22 ReplicaSets

Open lomigmegard opened this issue 10 months ago • 6 comments

Since Kubernetes version 1.22, ReplicaSets no longer scale down by removing the youngest pod first.

The issue was already raised in akka-management, where there are more details: https://github.com/akka/akka-management/issues/1130.

The issue is that we don't want Singletons to be moved more than necessary during a rolling update.

lomigmegard avatar Apr 15 '25 09:04 lomigmegard

Thanks @lomigmegard - this also affects scaling down when pod clusters have their pod counts reduced.

We can't copy the Akka change. If someone has time to produce an equivalent change that would be useful. Singletons will move to the next oldest cluster member if the oldest one is stopped. I wonder if it would also be possible to set the pod deletion cost based on knowledge of whether a Pekko cluster member has singletons deployed on it.

Our docs don't exactly encourage the use of singletons. Users should consider if they can rearchitect their applications to avoid relying on them. https://pekko.apache.org/docs/pekko/current/typed/cluster-singleton.html#introduction

pjfanning avatar Apr 15 '25 13:04 pjfanning

@pjfanning If it's okay, I'll have a go at this. I have a decent amount of k8s experience, and if it ends up taking too long I can always reassign it to someone else.

mdedetrich avatar Apr 16 '25 08:04 mdedetrich

We at the Eclipse Ditto project, which uses Apache Pekko and clustering, solved this with a script that patches the "pod deletion cost" based on how old the pods are.
Sharing it here so people have an alternative while this is not yet done in Pekko itself.

The script:

  • https://github.com/eclipse-ditto/ditto/blob/master/deployment/helm/ditto/scripts/patch-pods-deletion-cost.sh

Which is invoked regularly by a k8s cron job:

  • https://github.com/eclipse-ditto/ditto/blob/master/deployment/helm/ditto/templates/hooks/pod-deletion-cost-cron-job.yaml

And also prior to an upgrade:

  • https://github.com/eclipse-ditto/ditto/blob/master/deployment/helm/ditto/templates/hooks/pre-upgrade-job.yaml
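
For readers who want the gist without opening the links, here is a minimal Python sketch of the age-based idea. It is not the Ditto script. The helper names `deletion_costs` and `annotation_patch` and the `step` spacing are hypothetical, and it assumes pod start time is a good proxy for Pekko cluster member age. The annotation key `controller.kubernetes.io/pod-deletion-cost` is the real Kubernetes one: pods with a lower cost are preferred for removal when a ReplicaSet scales down.

```python
from datetime import datetime, timezone

# Real Kubernetes annotation; pods with a lower cost are removed first on scale-down.
DELETION_COST_ANNOTATION = "controller.kubernetes.io/pod-deletion-cost"


def deletion_costs(pod_start_times, step=100):
    """Map pod name -> deletion cost so that older pods get a higher cost.

    pod_start_times: dict of pod name -> timezone-aware start datetime.
    step: hypothetical spacing between costs (an assumption of this sketch).
    """
    # Sort youngest first: the youngest pod gets the lowest cost,
    # so it is the first candidate for deletion.
    youngest_first = sorted(
        pod_start_times, key=lambda name: pod_start_times[name], reverse=True
    )
    return {name: (rank + 1) * step for rank, name in enumerate(youngest_first)}


def annotation_patch(cost):
    """Build a merge-patch body for one pod; annotation values must be strings."""
    return {"metadata": {"annotations": {DELETION_COST_ANNOTATION: str(cost)}}}


# Example data (made up for illustration).
pods = {
    "app-a": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "app-b": datetime(2025, 1, 2, tzinfo=timezone.utc),
    "app-c": datetime(2025, 1, 3, tzinfo=timezone.utc),
}
costs = deletion_costs(pods)
```

Each patch body would then be applied to its pod through the Kubernetes API or `kubectl`. Because the oldest pod gets the highest cost, it is removed last, which should keep singletons on the oldest member for as long as possible.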

thjaeckle avatar Jul 24 '25 07:07 thjaeckle

Thanks @thjaeckle for sharing that. I'll create a PR over the next few days to add a section to our docs highlighting it.

pjfanning avatar Jul 24 '25 07:07 pjfanning

I did spend some time writing a solution for this feature a few months back, but the biggest issue was writing tests to make sure everything works as expected. It also didn't help that at the time I wasn't working somewhere that ran k8s in production, and I was doing this in my spare time without the capacity to push it through.

As a proposed alternative: a full solution was upstreamed into Akka Management 1.3.0 on March 28, 2023, which also includes a massive corpus of test suites (something that we are currently lacking a bit). It might be best to just wait until that Akka Management release hits 3 years, at which point it will automatically convert to the Apache 2.0 License, allowing us to backport it freely.

Doing this means that we can also guarantee the behaviour is the same as Akka's, which can help users who decide to migrate. We would have to wait about four and a half months, but given that Christmas/New Year is coming soon, to me this looks like the best option.

mdedetrich avatar Nov 10 '25 17:11 mdedetrich

Thanks @mdedetrich. In the meantime, the alternative solution provided by @thjaeckle is quite a good way to achieve the same result.

pjfanning avatar Nov 11 '25 16:11 pjfanning