Data loss when upgrading storage requests
When creating a resource of type ClickHouseInstallation, the initial requested storage is 10Gi, as shown in the snippet below:
- name: data-volumeclaim-template
  spec:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 10Gi
If I want to increase this storage to 30Gi, the data and the tables are lost and I need to re-create them.
The blog here: https://altinity.com/blog/preventing-clickhouse-storage-deletion-with-the-altinity-kubernetes-operator-reclaimpolicy suggests adding reclaimPolicy: Retain to the spec file.
Will this also help prevent data loss when increasing the storage requests? Please let me know!
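For reference, this is how I understand the suggested change would look in my CHI spec (just a sketch based on the blog post, not something I have applied yet):

templates:
  volumeClaimTemplates:
    - name: data-volumeclaim-template
      reclaimPolicy: Retain
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi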
Yes, reclaimPolicy: Retain helps avoid data loss if you scale down or delete the StatefulSet or its pods,
but your case does not look related to reclaimPolicy.
Which CSI driver handles your default StorageClass? Could you share the results of
kubectl get storageclass
Does your default storage class have allowVolumeExpansion: true?
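If it is not set, a StorageClass with expansion enabled looks roughly like this (names and parameters are just an example for AWS EBS, adjust for your environment):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-expandable   # example name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer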
@Slach yes I have allowVolumeExpansion: true in the storage class.
Also, as I said, the issue is about increasing the storage. When I do that, the storage does get increased, but all the tables and existing data are lost and I need to re-create them.
Is this expected?
Once again, I am not scaling down or deleting the StatefulSet or the pods.
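To be concrete, the only change I apply is bumping the storage request in the same volumeClaimTemplate, roughly like this:

- name: data-volumeclaim-template
  spec:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 30Gi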
What is your CSI driver? What is your Kubernetes provider?
NAME            PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2 (default)   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   4y135d
and on describing, I get this:
Name:                  gp2
IsDefaultClass:        Yes
Annotations:           kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"},"name":"gp2"},"parameters":{"fsType":"ext4","type":"gp2"},"provisioner":"kubernetes.io/aws-ebs","volumeBindingMode":"WaitForFirstConsumer"}
                       ,storageclass.kubernetes.io/is-default-class=true
Provisioner:           kubernetes.io/aws-ebs
Parameters:            fsType=ext4,type=gp2
AllowVolumeExpansion:  True
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     WaitForFirstConsumer
Events:                <none>
OK, this is aws-ebs, so data should not be lost in this use case.
Could you share
kubectl get chi --all-namespaces
kubectl get pod --all-namespaces -l app=clickhouse-operator
kubectl get chi --all-namespaces
NAMESPACE   NAME              CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE
logging     ch-installation   1          6       Completed                     5d21h
kubectl get pod --all-namespaces -l app=clickhouse-operator
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE
kube-system   clickhouse-operator-6c85c654cb-2zft9   2/2     Running   0          27d
Let's check the clickhouse-operator version:
kubectl get pod -n kube-system clickhouse-operator-6c85c654cb-2zft9 -o jsonpath='{.metadata.name}{"\t"}{range .spec.containers[*]}{.name}{"=>"}{.image}{", "}{end}{"\n"}'
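If copying the exact pod name is inconvenient, the same check can be run against all operator pods using the app=clickhouse-operator label shown above (this variant iterates over .items, so it works with a list query):

kubectl get pod --all-namespaces -l app=clickhouse-operator -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .spec.containers[*]}{.name}{"=>"}{.image}{", "}{end}{"\n"}{end}'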
@madhur-df, by default PVCs are controlled by the StatefulSet, which requires re-creating pods. The operator can manage PVCs directly; this is controlled by a setting in the CHI:
defaults:
  storageManagement:
    provisioner: Operator
But in any case there should not be any data loss. If a pod is recreated or restarted, the data on persistent disks is retained. Are you sure you are using a persistent disk for ClickHouse data? It should be mounted at /var/lib/clickhouse.
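One quick way to check (the pod name is a placeholder, pick any ClickHouse pod in the logging namespace):

kubectl -n logging exec <clickhouse-pod> -- df -h /var/lib/clickhouse
kubectl -n logging get pvc

If /var/lib/clickhouse is backed by an EBS volume, df should show a dedicated device there rather than the container's overlay filesystem.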
Based on this comment, your storage class has ReclaimPolicy set to Delete - when the operator recreates the StatefulSet (as the volume size in a StatefulSet is immutable), k8s can delete the volume during that period.
@ondrej-smola, "ReclaimPolicy: Delete" is OK. It means the PV will be deleted if the PVC is deleted, which is not the case here. When the STS is re-created, the PVC should not be affected.
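As an extra safety net, the reclaim policy of the PVs that are already bound can be switched to Retain independently of the StorageClass (standard Kubernetes command, the PV name is a placeholder):

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'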
I tried to reproduce this problem but failed. After deploying ClickHouse through a ClickHouseInstallation, creating tables and databases, running kubectl edit chi chi-addon-clickhouse-demo-clickhouse-0-0, changing the storage in the volumeClaimTemplate from 10Gi to 30Gi, and saving and exiting, the STS change did take effect and the pod was restarted; after reconnecting to ClickHouse, the tables and data were still there.
Because sts.Spec.persistentVolumeClaimRetentionPolicy.whenDeleted/whenScaled is set to Retain, the PVCs are retained and there is no data loss:
even if the STS is deleted or scaled down, the PVs are kept.
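For reference, that policy in the generated StatefulSet looks like this (sketch; the field is only present on Kubernetes versions where StatefulSet PVC auto-deletion is available, and Retain is also its default):

spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Retain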
Unfortunately, I am still facing this issue. I upgraded storage from 300Gi to 500Gi while running 3 shards and 2 replicas with ZooKeeper.
All my existing data was gone from ClickHouse. But it didn't let me create the tables again, because ZooKeeper still had the old metadata about partitions etc.
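Is the expected way to clean up that stale metadata something like the following (the pod name, replica name and ZooKeeper path are placeholders, I have not run this yet), or is there a better approach?

kubectl -n logging exec <clickhouse-pod> -- clickhouse-client -q "SYSTEM DROP REPLICA '<replica-name>' FROM ZKPATH '<zk-path-of-table>'"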