Operator fails to update status due to size
clickhouse-operator E0619 08:13:59.022232 1 controller.go:748] updateCHIObjectStatus():clickhouse/clickhouse-production/55d39e29-fdf9-43b3-9d82-51d7aeefb7dc:got error, all retries are exhausted. err: "rpc error: code = ResourceExhausted desc = trying to send message larger than max (3055012 vs. 2097152)"
We have a fairly large single CHI and are now getting this error from the operator.
A large chunk of the status section is the storage.xml from normalizedCompleted, around 420,000 bytes of the 2,097,152-byte max.
This also seems to affect parts of the code such as https://github.com/Altinity/clickhouse-operator/blob/0.24.0/pkg/model/chop_config.go#L151, where the status is used to compare the old and new configuration, causing unnecessary restarts.
One potential solution would be to make state storage configurable, allowing object storage to be used for state instead of the Kubernetes status.
One workaround is to bake some of your configuration into the ClickHouse image itself, reducing the status size below the maximum object size allowed by the api-server.
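As a rough illustration (not from this thread): if storage.xml is copied into a custom image at build time, into a configuration directory that the operator does not overlay with its generated ConfigMaps, the large files: entry can be dropped from the CHI, which is what actually shrinks the normalized status. All names and tags below are hypothetical.

spec:
  configuration:
    # conf.d/storage.xml intentionally omitted: it now ships inside the image
    files: {}
  templates:
    podTemplates:
      - name: ingest-pod-template
        spec:
          containers:
            - name: clickhouse
              # hypothetical image built on top of clickhouse-server:24.5.1.1763
              # with the storage configuration baked in at build time
              image: registry.example.com/clickhouse-server:24.5.1.1763-baked-config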
@tanner-bruce , is it possible to share the full CHI? I can see you are using configuration at the shard level -- what is the reason for that? Maybe the CHI can be made more compact.
@alex-zaitsev I'm not sure what shard-level configuration you mean. We have some different node types, and different disk sizes for some of the clusters.
@ondrej-smola That is a good idea; we could certainly do that for the storage.xml, but I think that is about it.
Currently we are looking at splitting the different clusters into their own CHIs and then using cluster discovery to link them to our query pods, but we are not sure how to migrate to that.
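To make the "link them to our query pods" idea concrete, here is a minimal sketch (not from this thread) of a query-only CHI that reaches split-out clusters through a hand-written remote_servers fragment. The host names follow the operator's per-host service naming and are purely illustrative.

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-query"          # hypothetical query-only CHI
spec:
  configuration:
    clusters:
      - name: "query"
        layout:
          shardsCount: 4
          replicasCount: 1
    files:
      conf.d/remote_servers.xml: |
        <clickhouse>
          <remote_servers>
            <!-- one entry per split-out cluster; host names are illustrative -->
            <cluster_7>
              <shard>
                <replica>
                  <host>chi-clickhouse-7-7-0-0</host>
                  <port>9000</port>
                </replica>
                <replica>
                  <host>chi-clickhouse-7-7-0-1</host>
                  <port>9000</port>
                </replica>
              </shard>
              <!-- ...repeat for the remaining shards... -->
            </cluster_7>
          </remote_servers>
        </clickhouse>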
Here is our full CHI, redacted mildly.
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "clickhouse"
spec:
configuration:
profiles:
clickhouse_operator/skip_unavailable_shards: 1
materialize_ttl_after_modify: 0
default/skip_unavailable_shards: 1
readonly/readonly: 1
settings:
async_insert_threads: 30
background_common_pool_size: 24
background_distributed_schedule_pool_size: 24
background_move_pool_size: 12
background_pool_size: 36
background_schedule_pool_size: 24
logger/level: debug
max_table_size_to_drop: 0
prometheus/asynchronous_metrics: true
prometheus/endpoint: /metrics
prometheus/events: true
prometheus/metrics: true
prometheus/port: "8888"
prometheus/status_info: true
clusters:
- name: "7"
layout:
shardsCount: 14
replicasCount: 2
templates:
dataVolumeClaimTemplate: data-7
- name: "6"
layout:
shardsCount: 14
replicasCount: 2
templates:
dataVolumeClaimTemplate: data-6
- name: "5"
layout:
shardsCount: 14
replicasCount: 2
templates:
dataVolumeClaimTemplate: data-5
podTemplate: ingest-2-pod-template
- name: "4"
layout:
shardsCount: 8
replicasCount: 2
templates:
dataVolumeClaimTemplate: data-4
- name: "3"
layout:
shardsCount: 8
replicasCount: 2
templates:
dataVolumeClaimTemplate: data-3
- name: "2"
layout:
shardsCount: 2
replicasCount: 2
templates:
dataVolumeClaimTemplate: data-2
- name: "1"
layout:
shardsCount: 1
replicasCount: 2
templates:
dataVolumeClaimTemplate: data-1
- name: "query"
templates:
clusterServiceTemplate: query-service
dataVolumeClaimTemplate: query-data
podTemplate: query-pod-template
layout:
shardsCount: 4
replicasCount: 1
files:
      conf.d/storage.xml: |
        <clickhouse>
          <storage_configuration>
            <disks>
              <gcs>
                <type>s3</type>
                <access_key_id from_env="GCS_ACCESS_KEY" />
                <secret_access_key from_env="GCS_SECRET_KEY" />
                <endpoint from_env="GCS_ENDPOINT" />
                <metadata_path>/var/lib/clickhouse/disks/gcs/</metadata_path>
                <support_batch_delete>false</support_batch_delete>
              </gcs>
              <gcs_6m>
                <type>s3</type>
                <access_key_id from_env="GCS_ACCESS_KEY" />
                <secret_access_key from_env="GCS_SECRET_KEY" />
                <endpoint from_env="GCS_ENDPOINT_6M_RETENTION" />
                <metadata_path>/var/lib/clickhouse/disks/gcs_6m/</metadata_path>
                <support_batch_delete>false</support_batch_delete>
              </gcs_6m>
              <gcs_1y>
                <type>s3</type>
                <access_key_id from_env="GCS_ACCESS_KEY" />
                <secret_access_key from_env="GCS_SECRET_KEY" />
                <endpoint from_env="GCS_ENDPOINT_1Y_RETENTION" />
                <metadata_path>/var/lib/clickhouse/disks/gcs_1y/</metadata_path>
                <support_batch_delete>false</support_batch_delete>
              </gcs_1y>
              <gcs_cache>
                <type>cache</type>
                <disk>gcs</disk>
                <cache_enabled>true</cache_enabled>
                <data_cache_enabled>true</data_cache_enabled>
                <enable_filesystem_cache>true</enable_filesystem_cache>
                <path>/var/lib/clickhouse/disks/gcscache/</path>
                <enable_filesystem_cache_log>true</enable_filesystem_cache_log>
                <max_size>10Gi</max_size>
              </gcs_cache>
              <gcs_6m_cache>
                <type>cache</type>
                <disk>gcs_6m</disk>
                <cache_enabled>true</cache_enabled>
                <data_cache_enabled>true</data_cache_enabled>
                <enable_filesystem_cache>true</enable_filesystem_cache>
                <path>/var/lib/clickhouse/disks/gcscache_6m/</path>
                <enable_filesystem_cache_log>true</enable_filesystem_cache_log>
                <max_size>10Gi</max_size>
              </gcs_6m_cache>
              <gcs_1y_cache>
                <type>cache</type>
                <disk>gcs_1y</disk>
                <cache_enabled>true</cache_enabled>
                <data_cache_enabled>true</data_cache_enabled>
                <enable_filesystem_cache>true</enable_filesystem_cache>
                <path>/var/lib/clickhouse/disks/gcscache_1y/</path>
                <enable_filesystem_cache_log>true</enable_filesystem_cache_log>
                <max_size>95Gi</max_size>
              </gcs_1y_cache>
              <ssd>
                <type>local</type>
                <path>/var/lib/clickhouse/disks/localssd/</path>
              </ssd>
            </disks>
            <policies>
              <storage_main>
                <volumes>
                  <ssd>
                    <disk>ssd</disk>
                  </ssd>
                  <gcs>
                    <disk>gcs_cache</disk>
                    <perform_ttl_move_on_insert>0</perform_ttl_move_on_insert>
                    <prefer_not_to_merge>true</prefer_not_to_merge>
                  </gcs>
                  <gcs_6m>
                    <disk>gcs_6m_cache</disk>
                    <perform_ttl_move_on_insert>0</perform_ttl_move_on_insert>
                    <prefer_not_to_merge>true</prefer_not_to_merge>
                  </gcs_6m>
                  <gcs_1y>
                    <disk>gcs_1y_cache</disk>
                    <perform_ttl_move_on_insert>0</perform_ttl_move_on_insert>
                    <prefer_not_to_merge>true</prefer_not_to_merge>
                  </gcs_1y>
                </volumes>
                <move_factor>0.1</move_factor>
              </storage_main>
            </policies>
          </storage_configuration>
        </clickhouse>
    zookeeper:
      nodes:
        - host: clickhouse-keeper
          port: 2181
      session_timeout_ms: 30000
      operation_timeout_ms: 10000
      root: /root
      identity: user:password
    users:
      migrations/access_management: 1
      migrations/k8s_secret_password: clickhouse/clickhouse
      migrations/networks/ip: "::/0"
      exporter/k8s_secret_password: clickhouse/clickhouse
      exporter/networks/ip: "::/0"
      grafana/k8s_secret_password: clickhouse/clickhouse
      grafana/networks/ip: "::/0"
      grafana/grants/query:
        - GRANT SELECT ON *.*
        - REVOKE ALL PRIVILEGES ON .
        - REVOKE ALL PRIVILEGES ON .
        - REVOKE ALL PRIVILEGES ON .
        - REVOKE ALL PRIVILEGES ON .
        - REVOKE ALL PRIVILEGES ON .
        - REVOKE ALL PRIVILEGES ON .
      api/k8s_secret_password: clickhouse/clickhouse
      api/networks/ip: "::/0"
  defaults:
    templates:
      logVolumeClaimTemplate: logs
      podTemplate: ingest-pod-template
      serviceTemplate: default-service
      clusterServiceTemplate: cluster-ingest-service
    storageManagement:
      reclaimPolicy: Retain
  templates:
    serviceTemplates:
      - name: default-service
        generateName: clickhouse-{chi}
        metadata:
          annotations:
            networking.gke.io/load-balancer-type: "Internal"
            networking.gke.io/internal-load-balancer-allow-global-access: "true"
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
          type: LoadBalancer
      - name: cluster-ingest-service
        generateName: ingest-{cluster}-{chi}
        metadata:
          annotations:
            networking.gke.io/load-balancer-type: "Internal"
            networking.gke.io/internal-load-balancer-allow-global-access: "true"
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
          type: LoadBalancer
      - name: query-service
        generateName: query-{chi}
        metadata:
          annotations:
            networking.gke.io/load-balancer-type: "Internal"
            networking.gke.io/internal-load-balancer-allow-global-access: "true"
        spec:
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000
          type: LoadBalancer
    podTemplates:
      - name: ingest-pod-template
        metadata:
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/schema: "http"
            prometheus.io/port: "8888"
            prometheus.io/path: "/metrics"
        spec:
          tolerations:
            - key: "app"
              operator: "Equal"
              value: "ingest"
              effect: "NoExecute"
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: cloud.google.com/gke-nodepool
                        operator: In
                        values:
                          - ingest
          containers:
            - env:
                - name: SHARD_BUCKET_PATH
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: POD_NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
                - name: CLUSTER
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.labels['clickhouse.altinity.com/cluster']
                - name: GCS_ENDPOINT
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
                - name: GCS_ENDPOINT_6M_RETENTION
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
                - name: GCS_ENDPOINT_1Y_RETENTION
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
              envFrom:
                - secretRef:
                    name: clickhouse
              image: clickhouse-server:24.5.1.1763
              name: clickhouse
              startupProbe:
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                failureThreshold: 100
                periodSeconds: 9
                timeoutSeconds: 1
              livenessProbe:
                failureThreshold: 100
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                initialDelaySeconds: 60
                periodSeconds: 30
                successThreshold: 1
                timeoutSeconds: 1
              readinessProbe:
                failureThreshold: 300
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                initialDelaySeconds: 10
                periodSeconds: 30
                successThreshold: 1
                timeoutSeconds: 1
              ports:
                - name: "metrics"
                  containerPort: 8888
              resources:
                limits:
                  memory: 10Gi
                requests:
                  cpu: 1000m
                  memory: 10Gi
              volumeMounts:
                - name: cache
                  mountPath: /var/lib/clickhouse/disks/gcscache
                - name: cache-6m
                  mountPath: /var/lib/clickhouse/disks/gcscache_6m
                - name: cache-1y
                  mountPath: /var/lib/clickhouse/disks/gcscache_1y
      - name: ingest-2-pod-template
        metadata:
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/schema: "http"
            prometheus.io/port: "8888"
            prometheus.io/path: "/metrics"
        spec:
          tolerations:
            - key: "app"
              operator: "Equal"
              value: "ingest-2"
              effect: "NoExecute"
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: cloud.google.com/gke-nodepool
                        operator: In
                        values:
                          - ingest-2
          containers:
            - env:
                - name: SHARD_BUCKET_PATH
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: POD_NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
                - name: CLUSTER
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.labels['clickhouse.altinity.com/cluster']
                - name: GCS_ENDPOINT
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
                - name: GCS_ENDPOINT_6M_RETENTION
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
                - name: GCS_ENDPOINT_1Y_RETENTION
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
              envFrom:
                - secretRef:
                    name: clickhouse
              image: clickhouse-server:24.5.1.1763
              name: clickhouse
              startupProbe:
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                failureThreshold: 100
                periodSeconds: 9
                timeoutSeconds: 1
              livenessProbe:
                failureThreshold: 100
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                initialDelaySeconds: 60
                periodSeconds: 30
                successThreshold: 1
                timeoutSeconds: 1
              readinessProbe:
                failureThreshold: 300
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                initialDelaySeconds: 10
                periodSeconds: 30
                successThreshold: 1
                timeoutSeconds: 1
              ports:
                - name: "metrics"
                  containerPort: 8888
              resources:
                limits:
                  memory: 10Gi
                requests:
                  cpu: 1000m
                  memory: 10Gi
              volumeMounts:
                - name: cache
                  mountPath: /var/lib/clickhouse/disks/gcscache
                - name: cache-6m
                  mountPath: /var/lib/clickhouse/disks/gcscache_6m
                - name: cache-1y
                  mountPath: /var/lib/clickhouse/disks/gcscache_1y
      - name: query-pod-template
        metadata:
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/schema: "http"
            prometheus.io/port: "8888"
            prometheus.io/path: "/metrics"
        spec:
          tolerations:
            - key: "app"
              operator: "Equal"
              value: "ingest"
              effect: "NoExecute"
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: cloud.google.com/gke-nodepool
                        operator: In
                        values:
                          - ingest
          containers:
            - env:
                - name: SHARD_BUCKET_PATH
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.name
                - name: CLUSTER
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.labels['clickhouse.altinity.com/cluster']
                - name: GCS_ENDPOINT
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
                - name: GCS_ENDPOINT_6M_RETENTION
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
                - name: GCS_ENDPOINT_1Y_RETENTION
                  value: bucket/$(POD_NAMESPACE)/$(CLUSTER)/$(SHARD_BUCKET_PATH)
              envFrom:
                - secretRef:
                    name: clickhouse
              image: clickhouse-server:24.5.1.1763
              name: clickhouse
              startupProbe:
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                failureThreshold: 40
                periodSeconds: 3
                timeoutSeconds: 1
              livenessProbe:
                failureThreshold: 10
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                initialDelaySeconds: 60
                periodSeconds: 3
                successThreshold: 1
                timeoutSeconds: 1
              readinessProbe:
                failureThreshold: 3
                httpGet:
                  path: /ping
                  port: http
                  scheme: HTTP
                initialDelaySeconds: 10
                periodSeconds: 3
                successThreshold: 1
                timeoutSeconds: 1
              ports:
                - name: "metrics"
                  containerPort: 8888
              resources:
                limits:
                  memory: 10Gi
                requests:
                  cpu: 1000m
                  memory: 10Gi
    volumeClaimTemplates:
      - name: data-7
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: data-6
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: data-5
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
      - name: data-4
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: data-3
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: data-2
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: data-1
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: query-data
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: cache
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: cache-6m
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: cache-1y
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: data
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: ssd
      - name: logs
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
I would definitely consider moving to multiple CHI objects and generating the shared configuration with some (git)ops tool. If I understand it correctly, this started after adding clusters 5, 6, and 7?
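One possible shape for that, sketched under the assumption that the operator's ClickHouseInstallationTemplate / useTemplates mechanism is used (all names below are illustrative): keep the shared pieces in a template and make each cluster a small CHI that pulls it in.

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallationTemplate"
metadata:
  name: "shared-config"                 # hypothetical shared template
spec:
  configuration:
    files:
      conf.d/storage.xml: |
        <clickhouse> <!-- shared storage configuration goes here --> </clickhouse>
---
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-7"                  # one small CHI per cluster
spec:
  useTemplates:
    - name: "shared-config"
  configuration:
    clusters:
      - name: "7"
        layout:
          shardsCount: 14
          replicasCount: 2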
@tanner-bruce , did it help after splitting the clusters into multiple CHIs?
@alex-zaitsev We have now run into this on a single cluster (after splitting). We are trying to move the storage.xml out of files: { conf.d/storage.xml: .. } and supply it via a ConfigMap instead, but I'm not having much luck mounting it into the conf.d location, because multiple ConfigMaps cannot be mounted to the same folder without using a projected volume. Otherwise I would need to use an init container. Do you have any thoughts here?
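For reference, a generic sketch of the projected-volume mechanism mentioned above, which merges several ConfigMaps into one directory. The ConfigMap names are hypothetical, and whether such a mount coexists cleanly with the conf.d ConfigMap the operator mounts itself is exactly the open question here.

spec:
  templates:
    podTemplates:
      - name: ingest-pod-template
        spec:
          containers:
            - name: clickhouse
              volumeMounts:
                - name: merged-confd
                  mountPath: /etc/clickhouse-server/conf.d
          volumes:
            - name: merged-confd
              projected:
                sources:
                  - configMap:
                      name: clickhouse-storage-config    # hypothetical, holds storage.xml
                  - configMap:
                      name: clickhouse-extra-config      # hypothetical, other overrides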
@tanner-bruce did you try the following?
spec:
  configuration:
    files:
      conf.d/storage.xml: |
        <content>
@Slach you can see from my CHI above that this is exactly what is already in it.
This gets replicated for every StatefulSet inside the status, causing a huge amount of character usage.
It is in the normalizedCompleted section:
normalizedCompleted:
  apiVersion: clickhouse.altinity.com/v1
  kind: ClickHouseInstallation
  metadata:
    creationTimestamp: "2024-01-25T16:15:26Z"
    finalizers:
      - finalizer.clickhouseinstallation.altinity.com
    generation: 23
    name: tracing
    namespace: clickhouse-production
    resourceVersion: "914532393"
    uid: 1a5d08aa-ca86-455e-a1fc-145119b78ad9
  spec:
    configuration:
      clusters:
        - files:
            conf.d/storage.xml: ....
We plan to move the normalized spec from the status to a separate ConfigMap.
@tanner-bruce, this is fixed in 0.24.3: the normalized CHI is now stored in a separate ConfigMap instead of the status.
Fixed in https://github.com/Altinity/clickhouse-operator/pull/1623
@alex-zaitsev we are now hitting the ConfigMap max size:
W0417 16:33:16.064626 1 cr.go:121] statusUpdateRetry():clickhouse-core/core/9a326975-286c-4306-96e4-abab9a8391a8:got error, will retry. err: "ConfigMap \"chi-storage-core\" is invalid: []: Too long: must have at most 1048576 bytes"
W0417 16:33:16.465528 1 cr.go:121] statusUpdateRetry():clickhouse-core/core/9a326975-286c-4306-96e4-abab9a8391a8:got error, will retry. err: "ConfigMap \"chi-storage-core\" is invalid: []: Too long: must have at most 1048576 bytes"