dotnet-operator-sdk Microsoft.Rest.HttpOperationException: Operation returned an invalid status code 'NotFound' when CRDs are missing.

Describe the bug

I'm not entirely sure if this is a bug or as designed, so please bear with me.

KubeOps v6.6.0

The pod throws an uncaught Microsoft.Rest.HttpOperationException: Operation returned an invalid status code 'NotFound' exception and dies on startup. Restarting the pod fixes the issue.

The pod apparently tries to watch custom resources (intentionally) whose CRDs are not yet present when the pod starts. Adding a deployment dependency (deploying the CRDs before starting the pod) fixes the issue. Also, we recently upgraded to k8s 1.22, and this did not occur before the upgrade.

fail: KubeOps.Operator.Kubernetes.ResourceWatcher[0]
      There was an error while watching the resource "MyResource".
      Microsoft.Rest.HttpOperationException: Operation returned an invalid status code 'NotFound'
         at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
         at k8s.Kubernetes.ListClusterCustomObjectWithHttpMessagesAsync(String group, String version, String plural, Nullable`1 allowWatchBookmarks, String continueParameter, String fieldSelector, String labelSelector, Nullable`1 limit, String resourceVersion, String resourceVersionMatch, Nullable`1 timeoutSeconds, Nullable`1 watch, Nullable`1 pretty, IDictionary`2 customHeaders, CancellationToken cancellationToken)
         at k8s.WatcherExt.<>c__DisplayClass1_0`2.<<MakeStreamReaderCreator>b__0>d.MoveNext()
      --- End of stack trace from previous location ---
         at k8s.Watcher`1.<>c.<CreateWatchEventEnumerator>b__21_1[TR](Task`1 t)
         at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
         at System.Threading.Tasks.Task.<>c.<.cctor>b__272_0(Object obj)
         at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
      --- End of stack trace from previous location ---
         at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
      --- End of stack trace from previous location ---
         at k8s.Watcher`1.CreateWatchEventEnumerator(Func`1 streamReaderCreator, Action`1 onError, CancellationToken cancellationToken)+MoveNext()
         at k8s.Watcher`1.CreateWatchEventEnumerator(Func`1 streamReaderCreator, Action`1 onError, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()
         at k8s.Watcher`1.WatcherLoop(CancellationToken cancellationToken)
         at k8s.Watcher`1.WatcherLoop(CancellationToken cancellationToken)

To Reproduce

Deploy the application watching a custom resource that does not exist in the cluster. (edited: clarified the step)

Expected behavior

The pod should not crash; it should wait for the CRDs to become available.

Sep 13 '22 15:09 alexander-klang

Hey @alexander-klang

It should not crash the whole pod. It should just throw the error and then perform an exponential backoff. Is it really crashing the whole pod?
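
To illustrate the retry behavior I mean, here is a minimal sketch of an exponential backoff loop. It is not the actual KubeOps code: `RetryWithBackoffAsync`, `startWatch`, and the delay values are hypothetical names and numbers chosen for the example, and the simulated failure only stands in for the 'NotFound' response that occurs while the CRD is missing.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BackoffSketch
{
    // Hypothetical helper (not part of KubeOps or the k8s client):
    // runs `operation` until it succeeds or `maxAttempts` is reached,
    // doubling the wait after each failure up to `maxDelay`.
    // Returns the number of attempts that were needed.
    public static async Task<int> RetryWithBackoffAsync(
        Func<Task> operation,
        int maxAttempts,
        TimeSpan baseDelay,
        TimeSpan maxDelay,
        CancellationToken cancellationToken = default)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return attempt;
            }
            // Assumption: every failure is treated as transient here.
            // A real operator would presumably narrow this filter to the
            // 'NotFound' case. On the last attempt the filter is false,
            // so the exception propagates to the caller.
            catch (Exception) when (attempt < maxAttempts)
            {
                // delay = baseDelay * 2^(attempt - 1), capped at maxDelay
                var factor = Math.Pow(2, attempt - 1);
                var delayMs = Math.Min(
                    baseDelay.TotalMilliseconds * factor,
                    maxDelay.TotalMilliseconds);
                await Task.Delay(TimeSpan.FromMilliseconds(delayMs), cancellationToken);
            }
        }
    }

    public static async Task Main()
    {
        var calls = 0;

        // Stand-in for "start watching the resource": fails twice, as if
        // the CRD were not installed yet, then succeeds.
        Func<Task> startWatch = () =>
        {
            calls++;
            if (calls < 3)
            {
                throw new InvalidOperationException("NotFound (simulated)");
            }
            return Task.CompletedTask;
        };

        var attempts = await RetryWithBackoffAsync(
            startWatch,
            maxAttempts: 5,
            baseDelay: TimeSpan.FromMilliseconds(10),
            maxDelay: TimeSpan.FromSeconds(1));

        Console.WriteLine($"watch started after {attempts} attempts");
    }
}
```

With a loop like this, the watcher would keep retrying until the CRDs show up instead of stopping after the first 'NotFound'. The cap on the delay keeps the operator reasonably responsive once the CRDs are finally deployed.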

Sep 14 '22 04:09 buehler

It prints the error and the pod is still in the "ready" state, but the reconcile method is never called again and no operations on objects are performed.

Sep 14 '22 07:09 ghost