controller-runtime Finalizer is removed after updating object status

We recently wanted to refactor how we update our CR with default values, and ran into an issue where a finalizer is removed upon updating the status subresource.

First, we get our CR, called Application:

application := &skiperatorv1alpha1.Application{}
err := r.GetClient().Get(ctx, req.NamespacedName, application)

Then we create a temporary application with the cluster state and apply defaults to the working application

tmpApplication := application.DeepCopy()
application.FillDefaultsSpec()
if !ctrlutil.ContainsFinalizer(application, applicationFinalizer) {
  ctrlutil.AddFinalizer(application, applicationFinalizer)
}

if len(application.Labels) == 0 {
  application.Labels = application.Spec.Labels
} else {
  aggregateLabels := application.Labels
  maps.Copy(aggregateLabels, application.Spec.Labels)
  application.Labels = aggregateLabels
}

application.FillDefaultsStatus()

We then diff the spec and status of the object, and update as such

if len(specDiff) > 0 {
  err := r.GetClient().Update(ctx, application)
  return reconcile.Result{Requeue: true}, err
}

if len(statusDiff) > 0 {
  err := r.GetClient().Status().Update(ctx, application)
  return reconcile.Result{Requeue: true}, err
}

When breakpointing on the status update, the finalizer field is set, however, when we step over to the return the finalizer is removed after updating the status of the Application.

I realise as I'm writing this that this may simply be circumvented by updating the status first, but I still want to post the issue as this does not seem like it is the expected behaviour.

Aug 18 '23 13:08 anderssonw

Yeah, this presumably happens because we zero the target object and the response only contains the status: https://github.com/kubernetes-sigs/controller-runtime/blob/304027bcbe4b3f6d582180aec5759eb4db3f17fd/pkg/client/apiutil/apimachinery.go#L216

The zeroing is done because the deserialization of go doesn't always do that, which makes that if you deserialize into a non-zeroed object, the resulting object might have data that was not in the response.

This is definitely not intended, but not sure if there is an easy fix for this.

/kind bug

Aug 19 '23 20:08 alvaroaleman

I understand, thanks for the information! We will definitely sort out a workaround, so not a huge worry :)

Aug 20 '23 19:08 anderssonw

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Jan 26 '24 17:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Feb 25 '24 18:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Mar 26 '24 19:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied

After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied

After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen

Mark this issue as fresh with /remove-lifecycle rotten

Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Mar 26 '24 19:03 k8s-ci-robot