
keeper PersistentVolumeClaims remain OutOfSync (ArgoCD)

Open · eranay-cu opened this issue 5 months ago · 1 comment

Environment

  • ClickHouse Operator Version: 0.25.2
  • ClickHouse Image: clickhouse-server:25.3
  • Kubernetes Version: v1.31.5
  • Cloud Provider: Azure

We are observing that PersistentVolumeClaim (PVC) resources for our internal keeper nodes are constantly showing an OutOfSync status in ArgoCD.

The resources are marked as Healthy, but they remain out of sync even when no changes have been made to our ClickHouseInstallation manifest. This requires manual syncing in ArgoCD.

Steps to Reproduce:

  1. Deploy a ClickHouseInstallation resource with an internal keeper cluster using version 0.25.2 of the clickhouse-operator.
  2. Manage the installation using ArgoCD.
  3. Observe the status of the keeper-N-keeper-... PVCs in the ArgoCD UI.

Expected Behavior: All resources managed by the operator, including the keeper PVCs, should have a Synced status in ArgoCD after the initial deployment.

Related issues I've read: #1714 #958


ClickHouseInstallation manifest:

apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: my-clickhouse-instance
  namespace: my-namespace
spec:
  configuration:
    clusters:
      - name: my-cluster
        layout:
          shardsCount: 1
          replicasCount: 2
    files:
      config.d/extra_config.xml: |
        <clickhouse>
          <prometheus>
            <endpoint>/metrics</endpoint>
            <port>9363</port>
          </prometheus>
        </clickhouse>
      config.d/openssl.xml: |
        <clickhouse>
          <openSSL>
            <server>
              <certificateFile>/etc/clickhouse-server/secrets.d/tls.crt/my-tls-secret/tls.crt</certificateFile>
              <privateKeyFile>/etc/clickhouse-server/secrets.d/tls.key/my-tls-secret/tls.key</privateKeyFile>
              <verificationMode>none</verificationMode>
            </server>
            <client>
              <verificationMode>none</verificationMode>
            </client>
          </openSSL>
        </clickhouse>
    settings:
      https_port: 8443
      interserver_https_port: 9009
      logger/level: warning
      tcp_port_secure: 9440
    users:
      my-user/password:
        valueFrom:
          secretKeyRef:
            # Generic secret name for credentials
            name: my-user-credentials
            key: password
      my-user/profile: default
      my-user/networks/ip: "::/0"
    zookeeper:
      nodes:
        - host: my-keeper-service.my-namespace.svc.cluster.local
          port: 2181
  defaults:
    templates:
      dataVolumeClaimTemplate: my-data-volume-template
      podTemplate: my-pod-template

eranay-cu · Aug 24 '25 17:08

As a workaround until this issue is solved, there's a not-yet-released Helm chart configuration value you can leverage to at least prevent ArgoCD from pruning/deleting the PVCs (and therefore the associated PVs) in certain sync scenarios:

keeper:
  volumeClaimAnnotations:
    argocd.argoproj.io/compare-options: IgnoreExtraneous

(see Altinity/helm-charts#56)
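Under the hood, that chart value just stamps the keeper PVCs with the ArgoCD compare-options annotation, so until a chart release with it lands, the same annotation can be applied to the PVCs by hand. A minimal sketch of the annotated PVC metadata, with a hypothetical PVC name (the operator generates the real ones):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Hypothetical name; match it to your actual keeper-N-keeper-... PVCs
  name: keeper-0-keeper-my-clickhouse-instance
  namespace: my-namespace
  annotations:
    # Tells ArgoCD to exclude this resource when computing diffs,
    # so operator-created PVCs stop showing as OutOfSync
    argocd.argoproj.io/compare-options: IgnoreExtraneous
```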

We've been using this already, and you should be able to have your ArgoCD instance deploy a patched ClickHouse Helm chart with that change.

fkywong · Sep 15 '25 18:09