Node does not exist
Describe the bug
Cluster never starts, some pods remain in init state:
NAME                      READY   STATUS     RESTARTS   AGE
pulsar-mini-bookie-0      0/1     Running    0          47s
pulsar-mini-broker-0      0/1     Init:0/2   0          47s
pulsar-mini-proxy-0       0/1     Init:0/2   0          47s
pulsar-mini-toolset-0     1/1     Running    0          47s
pulsar-mini-zookeeper-0   1/1     Running    0          47s
Error in init:
pulsar-mini-proxy-0 wait-zookeeper-ready WATCHER::
pulsar-mini-proxy-0 wait-zookeeper-ready
pulsar-mini-proxy-0 wait-zookeeper-ready WatchedEvent state:SyncConnected type:None path:null zxid: -1
pulsar-mini-proxy-0 wait-zookeeper-ready Node does not exist: /admin/clusters/pulsar-mini
pulsar-mini-proxy-0 wait-zookeeper-ready 2024-02-22T14:39:47,542+0000 [main] ERROR org.apache.zookeeper.util.ServiceUtils - Exiting JVM with code 1
pulsar-mini-proxy-0 wait-zookeeper-ready Connecting to pulsar-mini-zookeeper
pulsar-mini-proxy-0 wait-zookeeper-ready 2024-02-22T14:39:54,142+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.9.1-40487256d9b9f274484798758699e49c26d91cda, built on 2023-10-02 15:06 UTC
pulsar-mini-proxy-0 wait-zookeeper-ready 2024-02-22T14:39:54,145+0000 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=pulsar-mini-proxy-0.pulsar-mini-proxy.cdr.svc.cluster.local
values file mini-values.yaml:
---
components:
  ## pulsar-manager: disable
  pulsar_manager: false
  ## zookeeper
  zookeeper: true
  ## bookkeeper
  bookkeeper: true
  ## Disable bookkeeper - autorecovery
  autorecovery: false
  ## broker
  broker: true
  ## functions
  functions: true
  ## proxy
  proxy: true
  ## toolset
  toolset: true

## disable monitoring stack
kube-prometheus-stack:
  enabled: false
  prometheusOperator:
    enabled: false
  grafana:
    enabled: false
  alertmanager:
    enabled: false
  prometheus:
    enabled: false

# Disable persistence
volumes:
  persistence: false

zookeeper:
  replicaCount: 1
  externalZookeeperServerList: ""
  # Disable pod monitor since we're disabling CRD installation
  podMonitor:
    enabled: false

bookkeeper:
  replicaCount: 1
  configData:
    # minimal memory use for bookkeeper
    # https://bookkeeper.apache.org/docs/reference/config#db-ledger-storage-settings
    dbStorage_writeCacheMaxSizeMb: "32"
    dbStorage_readAheadCacheMaxSizeMb: "32"
    dbStorage_rocksDB_writeBufferSizeMB: "8"
    dbStorage_rocksDB_blockCacheSize: "8388608"
  # Disable pod monitor since we're disabling CRD installation
  podMonitor:
    enabled: false

broker:
  replicaCount: 1
  configData:
    ## Enable `autoSkipNonRecoverableData` since bookkeeper is running
    autoSkipNonRecoverableData: "true"
    # storage settings
    managedLedgerDefaultEnsembleSize: "1"
    managedLedgerDefaultWriteQuorum: "1"
    managedLedgerDefaultAckQuorum: "1"
  podMonitor:
    enabled: false

proxy:
  replicaCount: 1
  podMonitor:
    enabled: false
Then:
helm install --values mini-values.yaml --namespace cdr pulsar-mini apache/pulsar
Expected behavior
The cluster starts.
Additional context
Running ./scripts/pulsar/prepare_helm_release.sh -k pulsar-mini -n cdr before the helm install does not change the error.
✗ helm ls
NAME          NAMESPACE   REVISION   UPDATED                                STATUS     CHART          APP VERSION
pulsar-mini   cdr         1          2024-02-22 15:54:31.147885 +0100 CET   deployed   pulsar-3.2.0   3.0.2
✗ helm version
version.BuildInfo{Version:"v3.14.0", GitCommit:"3fc9f4b2638e76f26739cd77c7017139be81d0ea", GitTreeState:"clean", GoVersion:"go1.21.6"}
K8s cluster: v1.21.13
@nise-wg2 I don't see the init jobs running? Did they deploy OK?
✗ k get jobs -l "app=pulsar"
NAME COMPLETIONS DURATION AGE
pulsar-mini-bookie-init 1/1 15s 55s
pulsar-mini-pulsar-init 0/1 55s
And it fails because:
Warning FailedCreate 5s (x4 over 75s) job-controller Error creating: admission webhook "validation.gatekeeper.sh" denied the request: [wgtwo-reg] container <pulsar-mini-pulsar-init> has an invalid image repo <apachepulsar/pulsar-all:3.0.2>,
I had missed that.
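For anyone hitting the same symptom: the webhook rejects the pod at creation time, so there are no pod logs to read, and the reason only shows up in the Job's events. A minimal sketch of that check follows. The function name `show_job_events` is hypothetical, and the Job name and namespace in the usage comment are the ones from this thread.

```shell
# Hypothetical helper: describe one Job so its events are visible.
# A FailedCreate warning from an admission webhook is listed in the
# Events section, because the Job controller cannot create the pod.
show_job_events() {
  job="$1"
  namespace="$2"
  kubectl describe job "$job" --namespace "$namespace"
}

# Usage with the names from this thread:
#   show_job_events pulsar-mini-pulsar-init cdr
```

Checking the init Jobs first is probably quicker than reading the wait-zookeeper-ready logs, since those only report the missing node and not why it was never created.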
So there is a bug in the Helm chart: the image used by this Job is not covered by the list of images at https://github.com/apache/pulsar-helm-chart/blob/master/charts/pulsar/values.yaml#L137.
It requires a separate override in the values, like:
pulsar_metadata:
  image:
    repository: registry.wgtwo.com/reg/cdr/testing/apachepulsar/pulsar-all
    tag: 3.1.2
Then it seems to start.
k get pods -w -l "app=pulsar"
NAME                      READY   STATUS    RESTARTS   AGE
pulsar-mini-bookie-0      1/1     Running   0          115s
pulsar-mini-broker-0      1/1     Running   0          115s
pulsar-mini-proxy-0       0/1     Running   0          115s
pulsar-mini-toolset-0     1/1     Running   0          115s
pulsar-mini-zookeeper-0   1/1     Running   0          115s
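The `pulsar_metadata.image` override can also be passed on the command line instead of being added to mini-values.yaml. This is only a sketch: the function name `install_with_metadata_image` is hypothetical, and it assumes a fresh install (the earlier release removed first), since the thread does not say whether the fix was applied by reinstalling or upgrading.

```shell
# Hypothetical helper: install the release with the metadata-init
# Job's image pointed at a registry the admission policy allows.
# Release name, namespace, chart and values file are from this thread.
install_with_metadata_image() {
  repository="$1"
  tag="$2"
  helm install pulsar-mini apache/pulsar \
    --namespace cdr \
    --values mini-values.yaml \
    --set "pulsar_metadata.image.repository=$repository" \
    --set-string "pulsar_metadata.image.tag=$tag"
}

# Usage with the values from this thread:
#   install_with_metadata_image \
#     registry.wgtwo.com/reg/cdr/testing/apachepulsar/pulsar-all 3.1.2
```

`--set-string` keeps the tag a string even when it looks numeric. Putting the override in the values file, as above, is likely the better long-term choice because it stays with the rest of the configuration.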