Operator in CrashLoop with panic
We are trying to switch to the operator, and after an image update the operator started crash-looping:
2022-09-30T11:08:17.926Z INFO controller-runtime.manager.controller.solrcloud Starting EventSource {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud", "source": "kind source: /, Kind="}
2022-09-30T11:08:17.927Z INFO controller-runtime.manager.controller.solrcloud Starting Controller {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud"}
2022-09-30T11:08:17.927Z INFO controller-runtime.manager.controller.solrcloud Starting workers {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud", "worker count": 1}
2022-09-30T11:08:18.335Z INFO controller-runtime.manager.controller.solrprometheusexporter Starting workers {"reconciler group": "solr.apache.org", "reconciler kind": "SolrPrometheusExporter", "worker count": 1}
2022-09-30T11:08:18.430Z INFO controller-runtime.manager.controller.solrcloud Update required because field changed {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud", "name": "search-solr-test", "namespace": "search", "statefulSet": "search-solr-test-solrcloud", "kind": "statefulSet", "field": "Spec.Template.Spec.Volumes[1].VolumeSource", "from": {"secret":{"secretName":"gcp-search-solr-credentials-secret","defaultMode":420}}, "to": {"secret":{"secretName":"gcp-search-solr-credentials-secret"}}}
2022-09-30T11:08:18.432Z INFO controller-runtime.manager.controller.solrcloud Updating StatefulSet {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud", "name": "search-solr-test", "namespace": "search", "statefulSet": "search-solr-test-solrcloud"}
2022-09-30T11:08:18.549Z INFO controller-runtime.manager.controller.solrcloud.ManagedUpdateSelector Pod update selection started. {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud", "name": "search-solr-test", "namespace": "search", "outOfDatePods": 2, "maxPodsUnavailable": 1, "unavailableUpdatedPods": 0, "outOfDatePodsNotStarted": 0, "maxPodsToUpdate": 1}
2022-09-30T11:08:18.549Z INFO controller-runtime.manager.controller.solrcloud.ManagedUpdateSelector Pod killed for update. {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud", "name": "search-solr-test", "namespace": "search", "pod": "search-solr-test-solrcloud-0", "reason": "Pod's replicas are safe to take down, adhering to the minimum active replicas per shard."}
2022-09-30T11:08:18.549Z INFO controller-runtime.manager.controller.solrcloud.ManagedUpdateSelector Pod update selection complete. Maximum number of pods able to be updated reached. {"reconciler group": "solr.apache.org", "reconciler kind": "SolrCloud", "name": "search-solr-test", "namespace": "search", "maxPodsToUpdate": 1}
E0930 11:08:18.550190 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 561 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x13dd140, 0x2249580})
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000760b00})
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x75
panic({0x13dd140, 0x2249580})
/usr/local/go/src/runtime/panic.go:1038 +0x215
github.com/apache/solr-operator/controllers/util.EvictReplicasForPodIfNecessary({0x176d458, 0xc0058552c0}, 0xc005393b60, 0x1c, {0x1787d28, 0xc00547e6e0})
/workspace/controllers/util/solr_update_util.go:493 +0x67
github.com/apache/solr-operator/controllers.(*SolrCloudReconciler).Reconcile(0xc0002b3e60, {0x176d458, 0xc0058552c0}, {{{0xc000885770, 0x144a920}, {0xc000885760, 0xc0008e4380}}})
/workspace/controllers/solrcloud_controller.go:428 +0x3167
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000121180, {0x176d3b0, 0xc00029c000}, {0x1427120, 0xc000760b00})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298 +0x303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000121180, {0x176d3b0, 0xc00029c000})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2({0x176d3b0, 0xc00029c000})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:216 +0x46
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:213 +0x356
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x12b3627]
goroutine 561 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000760b00})
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0xd8
panic({0x13dd140, 0x2249580})
/usr/local/go/src/runtime/panic.go:1038 +0x215
github.com/apache/solr-operator/controllers/util.EvictReplicasForPodIfNecessary({0x176d458, 0xc0058552c0}, 0xc005393b60, 0x1c, {0x1787d28, 0xc00547e6e0})
/workspace/controllers/util/solr_update_util.go:493 +0x67
github.com/apache/solr-operator/controllers.(*SolrCloudReconciler).Reconcile(0xc0002b3e60, {0x176d458, 0xc0058552c0}, {{{0xc000885770, 0x144a920}, {0xc000885760, 0xc0008e4380}}})
/workspace/controllers/solrcloud_controller.go:428 +0x3167
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000121180, {0x176d3b0, 0xc00029c000}, {0x1427120, 0xc000760b00})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298 +0x303
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000121180, {0x176d3b0, 0xc00029c000})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2({0x176d3b0, 0xc00029c000})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:216 +0x46
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0x25
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7f9eb86e9250)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0010c22a0, {0x1746200, 0xc000bfa4b0}, 0x1, 0xc0003103c0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0010ba6d0, 0x3b9aca00, 0x0, 0x0, 0xc0010c22d0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext({0x176d3b0, 0xc00029c000}, 0xc0006677e0, 0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0x99
k8s.io/apimachinery/pkg/util/wait.UntilWithContext({0x176d3b0, 0xc00029c000}, 0x0, 0x0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99 +0x2b
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:213 +0x356
SolrCloud resource definition:
apiVersion: solr.apache.org/v1beta1
kind: SolrCloud
metadata:
  name: search-solr-test
  namespace: search
spec:
  busyBoxImage:
    repository: library/busybox
    tag: 1.28.0-glibc
  customSolrKubeOptions:
    podOptions:
      initContainers:
        - name: upload-zk-config
          image: xxx-docker.jfrog.io/xxx/search-solr:test-operator-3
          command: ["/var/solr-ricardo/scripts/load_configs_to_zookeeper.sh"]
          env:
            - name: ZK_HOST
              value: search-solr-test-solrcloud-zookeeper-0.search-solr-test-solrcloud-zookeeper-headless.search.svc.cluster.local:2181,search-solr-test-solrcloud-zookeeper-1.search-solr-test-solrcloud-zookeeper-headless.search.svc.cluster.local:2181,search-solr-test-solrcloud-zookeeper-2.search-solr-test-solrcloud-zookeeper-headless.search.svc.cluster.local:2181/
      imagePullSecrets:
        - name: xxx-docker-jfrog
      envVars:
        - name: ARTICLES_EXPIRATION_FIELD
          value: expiration_date
        - name: AUTO_DELETE_PERIOD_SECONDS
          value: "3600"
        - name: GCS_PROJECT_ID
          value: xxxxx
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: "/var/gcp-credentials/credentials.json"
      volumes:
        - name: gcp-credentials
          defaultContainerMount:
            name: gcp-credentials
            mountPath: /var/gcp-credentials
            readOnly: true
          source:
            secret:
              secretName: gcp-search-solr-credentials-secret
  dataStorage:
    persistent:
      pvcTemplate:
        metadata:
          annotations:
            volume.beta.kubernetes.io/storage-class: regional-ssd
          name: search-solr-test
        spec:
          resources:
            requests:
              storage: 20Gi
  replicas: 3
  solrAddressability:
    commonServicePort: 80
    podPort: 8983
  solrImage:
    repository: xxx-docker.jfrog.io/xxx/search-solr
    tag: test-operator-3
  solrJavaMem: -Xms2048m -Xmx4096m
  solrLogLevel: INFO
  updateStrategy:
    managed: { }
    method: Managed
  zookeeperRef:
    provided:
      chroot: /
      config: { }
      image:
        pullPolicy: IfNotPresent
        repository: pravega/zookeeper
      replicas: 3
      zookeeperPodPolicy:
        resources: { }
Status on the SolrCloud resource:
status:
  backupRestoreReady: false
  internalCommonAddress: http://search-solr-test-solrcloud-common.search
  podSelector: solr-cloud=search-solr-test,technology=solr-cloud
  readyReplicas: 2
  replicas: 3
  solrNodes:
    - internalAddress: http://search-solr-test-solrcloud-0.search-solr-test-solrcloud-headless.search:8983
      name: search-solr-test-solrcloud-0
      nodeName: gke-dev-cookie-e2-spoon-np-1fd433b7-l0z6
      ready: true
      specUpToDate: false
      version: test-operator
    - internalAddress: http://search-solr-test-solrcloud-1.search-solr-test-solrcloud-headless.search:8983
      name: search-solr-test-solrcloud-1
      nodeName: gke-dev-cookie-e2-fork-np-f4c2fe51-jkvi
      ready: true
      specUpToDate: false
      version: test-operator
    - internalAddress: http://search-solr-test-solrcloud-2.search-solr-test-solrcloud-headless.search:8983
      name: search-solr-test-solrcloud-2
      nodeName: gke-dev-cookie-e2-fork-np-3162ff26-q7vc
      ready: false
      specUpToDate: true
      version: test-operator-3
  targetVersion: test-operator-3
  upToDateNodes: 1
  version: test-operator
  zookeeperConnectionInfo:
    chroot: /
    externalConnectionString: N/A
    internalConnectionString: search-solr-test-solrcloud-zookeeper-0.search-solr-test-solrcloud-zookeeper-headless.search.svc.cluster.local:2181,search-solr-test-solrcloud-zookeeper-1.search-solr-test-solrcloud-zookeeper-headless.search.svc.cluster.local:2181,search-solr-test-solrcloud-zookeeper-2.search-solr-test-solrcloud-zookeeper-headless.search.svc.cluster.local:2181
Let me know if I can provide any more info that would help you investigate the issue.
Thanks for finding this bug! You can work around it locally by not setting a custom name for your data PVCs, but we will get a fix in as soon as possible. Not sure when the next release will be, but this fix will definitely be included.
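For reference, a minimal sketch of that workaround applied to the dataStorage section of the spec above; the custom metadata.name is simply omitted so the operator uses its default PVC name (the follow-up below notes that renaming it to data also avoids the crash). The annotation and storage request are carried over unchanged from this issue:

  dataStorage:
    persistent:
      pvcTemplate:
        metadata:
          annotations:
            volume.beta.kubernetes.io/storage-class: regional-ssd
          # no custom "name" here, so the operator falls back to its default PVC name
        spec:
          resources:
            requests:
              storage: 20Gi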
Hey, yes, I figured out as well that changing the PVC name to data fixed the crashing for us. We'll try removing the name as well. Thanks for the feedback and the fix! 👍🏻