controller-runtime Fake client refuses to initialize with the objects that have deletionTimestamp set but no finalizers

If you try to create a fake client with the following code,

func TestControllerReconcileWhenObjectDeleted(t *testing.T) {
        client := fake.NewClientBuilder().WithObjects(&corev1.Pod{
                ObjectMeta: metav1.ObjectMeta{
			Name:              "example",
			Namespace:         "default",
			DeletionTimestamp: &metav1.Time{Time: time.Now()},
		},
        }).Build()
}

the test will fail because

refusing to create obj example with metadata.deletionTimestamp but no finalizers

That makes no sense. These objects can just exist and the controllers should behave correctly in such cases. The restriction prevents us from simulating such a case with the fake client.

The issue was introduced in https://github.com/kubernetes-sigs/controller-runtime/commit/7a66d580c0c53504f5b509b45e9300cc18a1cc30.

May 29 '23 07:05 arkbriar

It does make sense, because deletiontimestamp-without-a-finalizer is a state you won't be able to observe in a real kube-apiserver either

May 29 '23 09:05 alvaroaleman

It does make sense, because deletiontimestamp-without-a-finalizer is a state you won't be able to observe in a real kube-apiserver either

I don't think so. Actually, you can simply get such an object by

Apply the Pod below and wait until it starts
Delete it

apiVersion: v1
kind: Pod
metadata:
  name: d
spec:
  terminationGracePeriodSeconds: 600
  containers:
  - name: d
    image: busybox:latest
    command:
    - sh
    args:
    - -c
    - while :; do sleep 1000; done
    lifecycle:
      preStop:
        exec:
          command: ["sleep", "10000"]

Here's an example:

$ k get pod d -o jsonpath='{"deletionTimestamp: "}{.metadata.deletionTimestamp}{"\nfinalizers: "}{.metadata.finalizers}'
deletionTimestamp: 2023-05-29T10:06:45Z
finalizers:

In fact, deletion in Kubernetes is just an alias of setting the deletionTimestamp to a non-zero value. It doesn't mean that the object will be deleted immediately. And there are many ways other than the finalizer that will keep it from the real deletion.

May 29 '23 09:05 arkbriar

True, pods are an exception to this to some degree. What other many ways other than finalizers are you aware of that will keep an object with deletion time stamp around?

May 29 '23 11:05 alvaroaleman

True, pods are an exception to this to some degree. What other many ways other than finalizers are you aware of that will keep an object with deletion time stamp around?

Failing apiservices might block the deletion of namespaces. I didn't read through the codes of the api services so it might be true for the custom resources. Anyway, I'm not here to debate that there are that many ways but I'd say the controller runtime should be able to support any cases including the Pods.

May 29 '23 12:05 arkbriar

The issue is that these cases are the exception, not the norm. What is implemented as of today is the default, which is that deleted objects only exist when they have a finalizer. We are open for PRs that correct the behavior for the resources for which the default doesn't apply. Other than that, you could use the newly-added interceptor to achieve other custom behavior.

All of that being said, assuming you will observe a deletion without using a finalizer is unsafe even for objects like pods, because your controller might be down. So a better way is likely to simply start using a finalizer.

May 29 '23 12:05 alvaroaleman

We are open for PRs that correct the behavior for the resources for which the default doesn't apply.

Sure, I will start a PR to fix the behaviour for Pods/Namespaces.

Other than that, you could use the newly-added interceptor to achieve other custom behavior.

Thanks for the reference. Will take a look.

assuming you will observe a deletion without using a finalizer is unsafe even for objects like pods, because your controller might be down.

Controllers working against Pods must deal with the situations to prevent from unexpected failures. It's not about safety. I think it makes sense at least for these cases.

May 29 '23 12:05 arkbriar

Sure, I will start a PR to fix the behaviour for Pods/Namespaces.

Namespaces actually have a finalizer that will remain if cleanup doesn't succeed for whatever reason. And pods only exhibit this behavior when they are running I think. Please look up the precise pod behavior, add the same to the fake client and a link to the upstream source.

May 29 '23 13:05 alvaroaleman

/kind bug support

Aug 28 '23 13:08 troy0820

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Jan 27 '24 04:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle rotten
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Feb 26 '24 05:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Mar 27 '24 06:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied

After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied

After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen

Mark this issue as fresh with /remove-lifecycle rotten

Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Mar 27 '24 06:03 k8s-ci-robot