Operator throwing error in an endless loop "too old resource version"
Bug Report
What did you do?
We are not sure what events led to this; it started occurring suddenly. A restart fixed it, but by that time the operator was non-functional, i.e. it was not reconciling anything.
What did you expect to see?
No errors
What did you see instead? Under which circumstances?
Our operator is throwing the following in an endless loop:
2024-04-19 08:59:46,858 i.f.k.c.d.i.AbstractWatchManager [ERROR] Received an error which is not a status but {"type":"ERROR","object":{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"too old resource version: 31159423 (31160199)","reason":"Expired","code":410}} - will retry
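For context, a 410 with reason "Expired" is how the Kubernetes API server reports that the resourceVersion a watch was started from is no longer available; the usual recovery is to re-list and start a new watch from the fresh resourceVersion. Below is a minimal sketch of recognizing that condition in the raw event shown above. `WatchEventClassifier` and its methods are hypothetical names for illustration, not fabric8 API, and the regex stands in for a proper JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of fabric8): classifies a raw watch event line. */
class WatchEventClassifier {

    // Matches the numeric "code" field of the embedded Status object.
    private static final Pattern CODE = Pattern.compile("\"code\"\\s*:\\s*(\\d+)");

    /**
     * Returns the status code carried by an ERROR event, or -1 if the line is
     * not an ERROR event or has no code. Assumes compact JSON without
     * whitespace around the "type" field, as in the log line above.
     */
    static int errorCode(String eventJson) {
        if (!eventJson.contains("\"type\":\"ERROR\"")) {
            return -1;
        }
        Matcher m = CODE.matcher(eventJson);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    /** True when the server reports that the watch's resourceVersion has expired (410). */
    static boolean isResourceVersionExpired(String eventJson) {
        return errorCode(eventJson) == 410;
    }

    public static void main(String[] args) {
        String event = "{\"type\":\"ERROR\",\"object\":{\"kind\":\"Status\",\"apiVersion\":\"v1\","
            + "\"metadata\":{},\"status\":\"Failure\","
            + "\"message\":\"too old resource version: 31159423 (31160199)\","
            + "\"reason\":\"Expired\",\"code\":410}}";
        if (isResourceVersionExpired(event)) {
            System.out.println("resourceVersion expired: re-list, then restart the watch");
        }
    }
}
```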
Environment
Kubernetes cluster type: EKS
$ Mention java-operator-sdk version from pom.xml file
4.8.2
$ java -version
openjdk version "21.0.2" 2024-01-16 LTS
OpenJDK Runtime Environment Corretto-21.0.2.13.1 (build 21.0.2+13-LTS)
OpenJDK 64-Bit Server VM Corretto-21.0.2.13.1 (build 21.0.2+13-LTS, mixed mode, sharing)
$ kubectl version
Client Version: v1.29.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.1-eks-b9c9ed7
Possible Solution
Additional context
Unfortunately no, this error has no logs prior to it; it just started occurring out of the blue. We are using the fabric8 client 6.10.0.
Actually, fabric8 6.11.0 is in the dependency tree.
This seems to be an issue with the watches in fabric8 client. cc @manusa @shawkins
Classloading issues make this logic susceptible to this problem - https://github.com/fabric8io/kubernetes-client/issues/5692
We could consider making the deserialization here generic instead, but more than likely the user will want to fix having more than one definition of Status in the classpath.
Hi @shawkins
> but more than likely the user will want to fix having more than one definition of Status in the classpath
I'm not sure what this Status is that you are referring to. Are you saying I should look at my mvn dependency:tree?
This is what the deps look like:
[INFO] +- io.javaoperatorsdk:operator-framework:jar:4.8.2:compile
[INFO] | +- io.javaoperatorsdk:operator-framework-core:jar:4.8.2:compile
[INFO] | | \- io.fabric8:kubernetes-client:jar:6.11.0:compile
.
.
.
.
[INFO] +- io.strimzi:api:jar:0.40.0:compile
[INFO] | +- io.fabric8:kubernetes-client-api:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-gatewayapi:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-resource:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-rbac:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-admissionregistration:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-apps:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-autoscaling:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-batch:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-certificates:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-coordination:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-discovery:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-events:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-extensions:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-flowcontrol:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-metrics:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-policy:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-scheduling:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-storageclass:jar:6.10.0:compile
[INFO] | | +- io.fabric8:kubernetes-model-node:jar:6.10.0:compile
[INFO] | | +- org.snakeyaml:snakeyaml-engine:jar:2.7:compile
[INFO] | | \- com.fasterxml.jackson.datatype:jackson-datatype-jsr310:jar:2.16.0:compile
[INFO] | +- io.fabric8:kubernetes-model-core:jar:6.10.0:compile
[INFO] | +- io.fabric8:kubernetes-model-networking:jar:6.10.0:compile
[INFO] | +- io.fabric8:kubernetes-model-common:jar:6.10.0:compile
[INFO] | +- io.fabric8:kubernetes-model-apiextensions:jar:6.10.0:compile
Is it better to keep the fabric8 version consistent?
@fhalde yes it is, especially if you don't have a flat classloader and end up with two different Status class definitions accessible from different classloaders.
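One way to spot the kind of version mix shown in the tree above is to scan the `mvn dependency:tree` output for `io.fabric8` artifacts and collect the distinct versions. The following is a rough sketch under that assumption; `Fabric8VersionCheck` is a hypothetical helper, and the pattern only handles plain `group:artifact:jar:version:scope` coordinates (no classifiers).

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reports which io.fabric8 versions appear in a dependency tree. */
class Fabric8VersionCheck {

    // groupId:artifactId:jar:version:scope, as printed by mvn dependency:tree.
    private static final Pattern COORD =
        Pattern.compile("io\\.fabric8:([\\w.-]+):jar:([\\w.-]+):(\\w+)");

    /** Collects the distinct versions of all io.fabric8 artifacts found in the given lines. */
    static SortedSet<String> fabric8Versions(List<String> treeLines) {
        SortedSet<String> versions = new TreeSet<>();
        for (String line : treeLines) {
            Matcher m = COORD.matcher(line);
            if (m.find()) {
                versions.add(m.group(2));
            }
        }
        return versions;
    }

    public static void main(String[] args) {
        // A few lines taken from the tree posted in this thread.
        List<String> tree = Arrays.asList(
            "[INFO] | | \\- io.fabric8:kubernetes-client:jar:6.11.0:compile",
            "[INFO] +- io.strimzi:api:jar:0.40.0:compile",
            "[INFO] | +- io.fabric8:kubernetes-client-api:jar:6.10.0:compile",
            "[INFO] | +- io.fabric8:kubernetes-model-core:jar:6.10.0:compile");
        SortedSet<String> versions = fabric8Versions(tree);
        if (versions.size() > 1) {
            System.out.println("Mixed fabric8 versions: " + versions);
        }
    }
}
```

More than one version in the result suggests the fabric8 artifacts need to be aligned, for example by pinning them to the version that JOSDK brings in.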
Hmm, we definitely don't make use of any classloaders. Is this some fabric8 internals? Anyway, here is what my fat jar contents look like:
$ jar -tvf operator.jar | grep '/Status.class'
io/javaoperatorsdk/operator/health/Status.class
io/strimzi/api/kafka/model/kafka/Status.class
io/fabric8/kubernetes/api/model/Status.class
org/apache/logging/log4j/core/util/internal/Status.class
ch/qos/logback/core/status/Status.class
@shawkins
Can you try to make sure that the fabric8 client version that gets put into your fat jar is the same version as the one used by JOSDK?
Hi @metacosm, we were running our operator with a single version of fabric8 for a few days, and today this error came up once again.
Here is what I could gather by attaching a debugger: the status message was unmarshalled into a GenericKubernetesResource class rather than Status. Weirdly, the error stopped a while after I attached the remote debugger.
If this comes up once again I'll let you know.
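To illustrate why that observation matters, here is a simplified model of the behavior described above: if the kind of the error object is not resolved to the typed Status class, the object falls back to a generic type, the "is it a Status?" check fails, and the handler only logs and retries. This is a sketch, not fabric8's actual code; `StatusFallbackDemo`, the kind registry, and the "resync" branch for a typed 410 are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Set;

/** Simplified, hypothetical model of typed vs. generic deserialization of a watch error. */
class StatusFallbackDemo {

    /** Stand-in for a typed Status model class. */
    static final class Status {
        final int code;
        final String reason;
        Status(int code, String reason) { this.code = code; this.reason = reason; }
    }

    /** Stand-in for an untyped catch-all resource. */
    static final class GenericResource {
        final Map<String, Object> fields;
        GenericResource(Map<String, Object> fields) { this.fields = fields; }
    }

    /** The "object" part of the ERROR event from the log, reduced to the relevant fields. */
    static Map<String, Object> sampleErrorObject() {
        return Map.of("kind", "Status", "apiVersion", "v1", "reason", "Expired", "code", 410);
    }

    /** Returns a typed Status only if "apiVersion/kind" is registered; otherwise a generic resource. */
    static Object deserialize(Map<String, Object> raw, Set<String> registeredKinds) {
        String key = raw.get("apiVersion") + "/" + raw.get("kind");
        if (registeredKinds.contains(key)) {
            return new Status((Integer) raw.get("code"), (String) raw.get("reason"));
        }
        return new GenericResource(raw);
    }

    /** Assumed handler: a typed 410 triggers a resync; anything untyped is only retried. */
    static String handleError(Object obj) {
        if (obj instanceof Status) {
            return ((Status) obj).code == 410 ? "resync" : "fail";
        }
        return "not a status, will retry";
    }

    public static void main(String[] args) {
        Map<String, Object> raw = sampleErrorObject();
        System.out.println(handleError(deserialize(raw, Set.of("v1/Status"))));
        System.out.println(handleError(deserialize(raw, Set.of())));
    }
}
```

In this model the second call never reaches the 410 handling, which would match an endless "will retry" loop; whether the real cause is the same is not established in this thread.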
Will close this issue; please let us know if that happens again.