use_environment_credentials is not working when using IRSA
ClickHouse server version: 24.1.2.5
ClickHouse backup version: 2.6.2
In my ClickHouse setup I set use_environment_credentials to true for the s3 disks, but remote backup cannot use the service account (IRSA) credentials.
<storage_configuration>
<disks>
<s3_backup>
<type>s3</type>
<endpoint>https://xxxxxxxxxxx.s3.amazonaws.com/</endpoint>
<use_environment_credentials>true</use_environment_credentials>
</s3_backup>
<!--
default disk is special, it always exists even if not explicitly configured here,
but you can't change its path here (you should use <path> in the top-level config instead)
-->
<default>
<!--
You can reserve some amount of free space on any disk (including default) by adding
keep_free_space_bytes tag.
-->
<keep_free_space_bytes>10485760</keep_free_space_bytes>
</default>
<s3>
<type>s3</type>
<endpoint>https://xxxxxxxxxxxx.s3.amazonaws.com/data2/</endpoint>
<use_environment_credentials>true</use_environment_credentials>
</s3>
</disks>
</storage_configuration>
The following warning is shown:
2024-10-10 21:40:38.196 WRN pkg/storage/object_disk/object_disk.go:361 > /var/lib/clickhouse/preprocessed_configs/config.xml -> //storage_configuration/disks/s3_backup doesn't contains <access_key_id> and <secret_access_key> environment variables will use
2024-10-10 21:40:38.200 WRN pkg/storage/object_disk/object_disk.go:361 > /var/lib/clickhouse/preprocessed_configs/config.xml -> //storage_configuration/disks/s3 doesn't contains <access_key_id> and <secret_access_key> environment variables will use
And the following error is shown:
2024-10-10 21:40:55.236 FTL cmd/clickhouse-backup/main.go:658 > error="one of createBackupLocal go-routine return error: one of uploadObjectDiskParts go-routine return error: b.dst.CopyObject in /var/lib/clickhouse/disks/s3/backup/2024-10-10-full/shadow/signoz_logs/logs/s3 error: S3->CopyObject data2/vkw/nyfgwkxlxfhshaxogyccexradjzrf -> xxxxxxxxxxx/s3/2024-10-10-full/s3/vkw/nyfgwkxlxfhshaxogyccexradjzrf return error: operation error S3: CopyObject, https response error StatusCode: 403, RequestID: DQCDY8K4EBJDPEN6, HostID: t/U9Ut73DraD/sbHxG6xLKitulhU867kZV8TQOxJ4tvWhI7CmlUv62nzRdKVfi9vafyt9p+v4Rs=, api error AccessDenied: Access Denied"
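For context on what the 403 means here: the failing call is a server-side S3 CopyObject from the tiered-storage bucket into the backup bucket, so the role needs read access on the source objects and write access on the destination. A minimal policy sketch (bucket names are placeholders, not the actual buckets from this report) would look like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowServerSideCopyForBackup",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::SOURCE_DISK_BUCKET",
        "arn:aws:s3:::SOURCE_DISK_BUCKET/*",
        "arn:aws:s3:::BACKUP_BUCKET",
        "arn:aws:s3:::BACKUP_BUCKET/*"
      ]
    }
  ]
}
```

Even with such a policy attached, AccessDenied can still come from a bucket policy or KMS key policy on either side, so those are worth checking too.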
My config
general:
remote_storage: s3
max_file_size: 1073741824
backups_to_keep_local: -1
backups_to_keep_remote: 0
log_level: info
allow_empty_backups: false
download_concurrency: 8
upload_concurrency: 8
download_max_bytes_per_second: 0
upload_max_bytes_per_second: 0
object_disk_server_side_copy_concurrency: 32
allow_object_disk_streaming: false
restore_schema_on_cluster: "cluster"
upload_by_part: true
download_by_part: true
use_resumable_state: true
restore_database_mapping: {}
restore_table_mapping: {}
retries_on_failure: 3
retries_pause: 5s
watch_interval: 1h
full_interval: 24h
watch_backup_name_template: "shard{shard}-{type}-{time:20060102150405}"
sharded_operation_mode: none
cpu_nice_priority: 15
io_nice_priority: "idle"
rbac_backup_always: true
rbac_resolve_conflicts: "recreate"
clickhouse:
username: default
password: ""
host: localhost
port: 9000
skip_tables:
- system.*
- INFORMATION_SCHEMA.*
- information_schema.*
- default.*
timeout: 6h
freeze_by_part: false
freeze_by_part_where: ""
secure: false
skip_verify: true
sync_replicated_tables: true
log_sql_queries: false
debug: false
config_dir: "/etc/clickhouse-server"
ignore_not_exists_error_during_freeze: true
check_replicas_before_attach: true
use_embedded_backup_restore: false
embedded_backup_disk: ""
backup_mutations: true
restore_as_attach: true
check_parts_columns: true
max_connections: 0
s3:
bucket: "xxxxxxxxxxxxx"
endpoint: ""
region: us-east-1
acl: private
assume_role_arn: ""
force_path_style: false
path: ""
object_disk_path: "backups/"
disable_ssl: false
compression_level: 1
compression_format: tar
disable_cert_verification: true
use_custom_storage_class: false
storage_class: STANDARD
concurrency: 1
part_size: 0
max_parts_count: 10000
allow_multipart_download: false
checksum_algorithm: ""
For test purposes I selected just one table for backup and it worked:
- selected 1 table that does not contain data on s3 (tiered) - works
- selected 1 table that contains data on s3 (tiered) - works
But when I select a set of tables, AccessDenied is shown.
Output of a successful backup (written to s3):
chi-signoz-tools-cluster-clickhouse-cluster-0-0-0:~$ ./clickhouse-backup -c config.yml list
backup5 64.95GiB 10/10/2024 18:01:35 remote tar, regular
2024-10-10 10.00GiB 10/10/2024 21:15:30 remote tar, regular
Backup does not work with the following selected tables:
chi-signoz-tools-cluster-clickhouse-cluster-0-0-0:~$ ./clickhouse-backup -c config.yml tables
signoz_logs.logs 173.72GiB default,s3 full
signoz_traces.signoz_index_v2 160.41GiB default,s3 full
signoz_logs.logs_v2 65.64GiB default full
signoz_traces.durationSort 52.24GiB default,s3 full
signoz_traces.signoz_spans 21.50GiB default,s3 full
signoz_metrics.samples_v2 11.96GiB default full
signoz_metrics.samples_v4 10.01GiB default,s3 full
signoz_metrics.samples_v4_agg_5m 4.71GiB default full
signoz_metrics.samples_v4_agg_30m 1.34GiB default full
signoz_metrics.time_series_v4 1.22GiB default,s3 full
signoz_metrics.time_series_v4_6hrs 1017.92MiB default,s3 full
signoz_metrics.time_series_v4_1day 889.61MiB s3,default full
signoz_metrics.time_series_v2 873.61MiB default full
signoz_metrics.time_series_v4_1week 832.37MiB default full
signoz_traces.span_attributes 775.90MiB default full
signoz_logs.tag_attributes 692.19MiB default full
signoz_traces.dependency_graph_minutes_v2 225.38MiB s3,default full
signoz_traces.dependency_graph_minutes 140.90MiB default full
signoz_traces.signoz_error_index_v2 113.24MiB default,s3 full
signoz_logs.logs_v2_resource 8.15MiB default full
signoz_logs.distributed_logs 1.18MiB default full
signoz_logs.distributed_logs_v2 691.83KiB default full
signoz_metrics.distributed_samples_v4 526.60KiB default full
signoz_logs.distributed_tag_attributes 264.70KiB default full
signoz_metrics.distributed_samples_v2 257.77KiB default full
signoz_analytics.rule_state_history 56.81KiB default full
signoz_traces.usage_explorer 55.57KiB default,s3 full
signoz_logs.distributed_logs_v2_resource 42.04KiB default full
signoz_metrics.distributed_time_series_v4 17.99KiB default full
signoz_metrics.usage 12.06KiB default full
signoz_logs.usage 10.00KiB default full
signoz_traces.usage 9.23KiB default full
signoz_traces.top_level_operations 7.23KiB default full
signoz_metrics.distributed_time_series_v2 5.51KiB default full
signoz_traces.span_attributes_keys 5.37KiB default full
signoz_logs.logs_resource_keys 1.08KiB default full
signoz_traces.schema_migrations 1.00KiB default full
signoz_logs.schema_migrations 719B default full
signoz_logs.logs_attribute_keys 708B default full
signoz_metrics.schema_migrations 598B default full
signoz_logs.resource_keys_string_final_mv 0B default full
signoz_metrics.distributed_samples_v4_agg_30m 0B default full
signoz_metrics.distributed_samples_v4_agg_5m 0B default full
signoz_logs.distributed_usage 0B default full
signoz_metrics.distributed_time_series_v3 0B default full
signoz_logs.distributed_logs_resource_keys 0B default full
signoz_metrics.distributed_time_series_v4_1day 0B default full
signoz_metrics.distributed_time_series_v4_1week 0B default full
signoz_metrics.distributed_time_series_v4_6hrs 0B default full
signoz_metrics.distributed_usage 0B default full
signoz_metrics.exp_hist 0B default full
signoz_logs.distributed_logs_attribute_keys 0B default full
signoz_logs.attribute_keys_string_final_mv 0B default full
signoz_logs.attribute_keys_float64_final_mv 0B default full
signoz_metrics.samples_v4_agg_30m_mv 0B default full
signoz_logs.attribute_keys_bool_final_mv 0B default full
signoz_metrics.samples_v4_agg_5m_mv 0B default full
signoz_analytics.distributed_rule_state_history 0B default full
signoz_metrics.time_series_v3 0B default full
signoz_metrics.time_series_v4_1day_mv 0B s3,default full
signoz_metrics.time_series_v4_1week_mv 0B default full
signoz_metrics.time_series_v4_6hrs_mv 0B s3,default full
signoz_traces.dependency_graph_minutes_db_calls_mv 0B default full
signoz_traces.dependency_graph_minutes_db_calls_mv_v2 0B default,s3 full
signoz_traces.dependency_graph_minutes_messaging_calls_mv 0B default full
signoz_traces.dependency_graph_minutes_messaging_calls_mv_v2 0B default,s3 full
signoz_traces.dependency_graph_minutes_service_calls_mv 0B default full
signoz_traces.dependency_graph_minutes_service_calls_mv_v2 0B s3,default full
signoz_traces.distributed_dependency_graph_minutes 0B default full
signoz_traces.distributed_dependency_graph_minutes_v2 0B default full
signoz_traces.distributed_durationSort 0B default full
signoz_traces.distributed_signoz_error_index_v2 0B default full
signoz_traces.distributed_signoz_index_v2 0B default full
signoz_traces.distributed_signoz_spans 0B default full
signoz_traces.distributed_span_attributes 0B default full
signoz_traces.distributed_span_attributes_keys 0B default full
signoz_traces.distributed_top_level_operations 0B default full
signoz_traces.distributed_usage 0B default full
signoz_traces.distributed_usage_explorer 0B default full
signoz_traces.durationSortMV 0B default,s3 full
signoz_traces.root_operations 0B default full
signoz_traces.signoz_error_index 0B default full
signoz_traces.signoz_index 0B default full
signoz_traces.sub_root_operations 0B default full
signoz_traces.usage_explorer_mv 0B default,s3 full
signoz_metrics.distributed_exp_hist 0B default full
When I selected just signoz_metrics.samples_v4, which contains data on both the local disk and remote (s3), the backup succeeded.
- I am running clickhouse-backup on the same host as the server
- For tests I set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY env vars, but it did not work
- I ran clickhouse-backup with --env S3_ACCESS_KEY=xxx and --env SECRET_KEY=xxx, and it did not work
Note: my IAM role has full access to s3
Another test
chi-signoz-tools-cluster-clickhouse-cluster-0-0-0:~$ ./clickhouse-backup -c config.yml tables
signoz_metrics.samples_v2 11.97GiB default full
signoz_metrics.samples_v4 10.01GiB default,s3 full
chi-signoz-tools-cluster-clickhouse-cluster-0-0-0:~$ ./clickhouse-backup -c config.yml create_remote partial
2024-10-10 21:55:48.653 INF pkg/backup/create.go:170 > done createBackupRBAC size=0B
2024-10-10 21:55:48.925 WRN pkg/backup/backuper.go:118 > MAX_FILE_SIZE=1073741824 is less than actual 17035327904, please remove general->max_file_size section from your config
2024-10-10 21:55:49.845 INF pkg/backup/create.go:324 > done progress=9/215 table=signoz_metrics.samples_v2
2024-10-10 21:55:50.179 INF pkg/backup/create.go:324 > done progress=10/215 table=signoz_metrics.samples_v4
2024-10-10 21:55:50.197 INF pkg/backup/create.go:336 > done duration=2.128s operation=createBackupLocal version=2.6.2
2024-10-10 21:57:27.083 INF pkg/backup/upload.go:171 > done duration=1m36.326s operation=upload_data progress=2/2 size=10.01GiB table=signoz_metrics.samples_v4 version=2.6.2
2024-10-10 21:57:36.590 INF pkg/backup/upload.go:171 > done duration=1m45.832s operation=upload_data progress=1/2 size=11.97GiB table=signoz_metrics.samples_v2 version=2.6.2
2024-10-10 21:57:36.632 INF pkg/backup/upload.go:240 > done backup=partial duration=1m46.434s object_disk_size=0B operation=upload upload_size=21.98GiB version=2.6.2
2024-10-10 21:57:37.056 INF pkg/backup/delete.go:157 > remove '/var/lib/clickhouse/backup/partial'
2024-10-10 21:57:37.142 INF pkg/backup/delete.go:157 > remove '/var/lib/clickhouse/disks/s3_backup/backup/partial'
2024-10-10 21:57:37.142 INF pkg/backup/delete.go:157 > remove '/var/lib/clickhouse/disks/s3/backup/partial'
2024-10-10 21:57:37.142 INF pkg/backup/delete.go:166 > done backup=partial duration=496ms location=local operation=delete
The previous warning is not shown (the following log is from my previous post):
2024-10-10 21:40:38.196 WRN pkg/storage/object_disk/object_disk.go:361 > /var/lib/clickhouse/preprocessed_configs/config.xml -> //storage_configuration/disks/s3_backup doesn't contains <access_key_id> and <secret_access_key> environment variables will use
2024-10-10 21:40:38.200 WRN pkg/storage/object_disk/object_disk.go:361 > /var/lib/clickhouse/preprocessed_configs/config.xml -> //storage_configuration/disks/s3 doesn't contains <access_key_id> and <secret_access_key> environment variables will use
Thanks for the detailed report
Did you set up AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY inside the clickhouse-backup container?
Try --env AWS_ACCESS_KEY_ID --env AWS_SECRET_ACCESS_KEY
or --env AWS_ROLE_ARN
Could you share your current pod manifest, with sensitive credentials replaced by XXX?
kubectl -n <your-namespace> get pod chi-signoz-tools-cluster-clickhouse-cluster-0-0-0 -o yaml
When you use IRSA, which serviceAccount do you use?
In this case, the serviceAccount token is mounted into the pod and some environment variables are injected into the env section.
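To make the precedence concrete, here is a simplified sketch (not the real SDK code) of how the AWS default credential chain picks a source from those injected variables. `credential_source` is a hypothetical helper; real SDKs also consult shared config files and instance metadata, which are omitted here:

```python
import os

def credential_source(env: dict) -> str:
    """Return which credential source the AWS default chain would pick
    from environment variables alone (simplified illustration)."""
    # Static env keys are checked before web-identity credentials.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "static-env-keys"
    # IRSA: the EKS pod webhook injects these two variables, and the SDK
    # calls STS AssumeRoleWithWebIdentity with the mounted token.
    if env.get("AWS_ROLE_ARN") and env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        return "web-identity (IRSA)"
    return "none"

# With only the IRSA variables injected by the EKS pod webhook:
irsa_env = {
    "AWS_ROLE_ARN": "arn:aws:iam::123456789012:role/ClickhouseEKSRole",
    "AWS_WEB_IDENTITY_TOKEN_FILE": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token",
}
print(credential_source(irsa_env))  # web-identity (IRSA)
```

One practical consequence: setting AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY in the container shadows the IRSA role entirely, since static keys win in the chain.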
path: ""
object_disk_path: "backups/"
Better to replace these with:
path: "backups"
object_disk_path: "object_disks_backups"
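As a fallback while debugging environment credentials, explicit keys can also be set in the s3 section of the clickhouse-backup config (these map to the S3_ACCESS_KEY / S3_SECRET_KEY environment variables); the values below are placeholders:

```yaml
s3:
  bucket: "xxxxxxxxxxxxx"
  access_key: "XXXXXXXX"   # placeholder; equivalent to S3_ACCESS_KEY
  secret_key: "XXXXXXXX"   # placeholder; equivalent to S3_SECRET_KEY
```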
The warning and error will show only if you have data parts on the s3 disk.
Related code fragment: https://github.com/Altinity/clickhouse-backup/blob/master/pkg/storage/object_disk/object_disk.go#L354-L367
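For illustration, here is a simplified Python re-implementation of the check that linked Go code performs: for each s3-type disk in the preprocessed config, warn when the XML carries no <access_key_id>/<secret_access_key>, meaning environment credentials will be used instead. The sample XML below is a stand-in, not the reporter's actual config:

```python
import xml.etree.ElementTree as ET

CONFIG = """
<clickhouse>
  <storage_configuration>
    <disks>
      <s3_backup>
        <type>s3</type>
        <endpoint>https://example.s3.amazonaws.com/</endpoint>
        <use_environment_credentials>true</use_environment_credentials>
      </s3_backup>
      <s3_with_keys>
        <type>s3</type>
        <endpoint>https://example.s3.amazonaws.com/data/</endpoint>
        <access_key_id>AKIAEXAMPLE</access_key_id>
        <secret_access_key>secret</secret_access_key>
      </s3_with_keys>
    </disks>
  </storage_configuration>
</clickhouse>
"""

def disks_using_env_credentials(xml_text: str) -> list:
    """Return names of s3-type disks with no inline credentials."""
    root = ET.fromstring(xml_text)
    result = []
    for disk in root.find("storage_configuration/disks"):
        if disk.findtext("type") != "s3":
            continue
        # No inline keys -> the environment credential chain will be used.
        if disk.find("access_key_id") is None or disk.find("secret_access_key") is None:
            result.append(disk.tag)
    return result

print(disks_using_env_credentials(CONFIG))  # ['s3_backup']
```

So the warning itself is informational; with working IRSA credentials it is expected and harmless.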
Did you setup AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY inside clickhouse-backup container?
I'm running the clickhouse-backup binary inside the clickhouse-server container. The service account works for normal clickhouse-server workloads (s3 disk as cold storage) with s3 full access.
try --env AWS_ACCESS_KEY_ID --env AWS_SECRET_ACCESS_KEY or --env AWS_ROLE_ARN
I tried, but it didn't work.
path: "" object_disk_path: "backups/" better to replace it
I changed path, but there were no changes in the s3 structure, as if the config was ignored.
I'm using the SigNoz helm chart with the clickhouse dependency: 3 shards and 1 replica per shard.
ClickHouse pod generated manifest:
apiVersion: v1
kind: Pod
metadata:
annotations:
signoz.io/path: /metrics
signoz.io/port: "9363"
signoz.io/scrape: "true"
labels:
app.kubernetes.io/component: clickhouse
app.kubernetes.io/instance: signoz-tools-cluster
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: clickhouse
app.kubernetes.io/version: 24.1.2
apps.kubernetes.io/pod-index: "0"
argocd.argoproj.io/instance: signoz-tools-cluster
clickhouse.altinity.com/app: chop
clickhouse.altinity.com/chi: signoz-tools-cluster-clickhouse
clickhouse.altinity.com/cluster: cluster
clickhouse.altinity.com/namespace: signoz
clickhouse.altinity.com/ready: "yes"
clickhouse.altinity.com/replica: "0"
clickhouse.altinity.com/shard: "0"
helm.sh/chart: clickhouse-24.1.6
statefulset.kubernetes.io/pod-name: chi-signoz-tools-cluster-clickhouse-cluster-0-0-0
name: chi-signoz-tools-cluster-clickhouse-cluster-0-0-0
namespace: signoz
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: chi-signoz-tools-cluster-clickhouse-cluster-0-0
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app.kubernetes.io/component
operator: In
values:
- zookeeper
- clickhouse
topologyKey: kubernetes.io/hostname
containers:
- command:
- /bin/bash
- -c
- /usr/bin/clickhouse-server --config-file=/etc/clickhouse-server/config.xml
env:
- name: AWS_STS_REGIONAL_ENDPOINTS
value: regional
- name: AWS_DEFAULT_REGION
value: us-east-1
- name: AWS_REGION
value: us-east-1
- name: AWS_ROLE_ARN
value: arn:aws:iam::xxxxxxxxxxx:role/ClickhouseEKSRole
- name: AWS_WEB_IDENTITY_TOKEN_FILE
value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
image: xxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/docker-hub/clickhouse/clickhouse-server:24.1.2-alpine
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 10
httpGet:
path: /ping
port: http
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 3
successThreshold: 1
timeoutSeconds: 1
name: clickhouse
ports:
- containerPort: 8123
name: http
protocol: TCP
- containerPort: 9000
name: client
protocol: TCP
- containerPort: 9009
name: interserver
protocol: TCP
- containerPort: 9000
name: tcp
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /ping
port: http
scheme: HTTP
initialDelaySeconds: 10
periodSeconds: 3
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
cpu: "4"
memory: 12Gi
requests:
cpu: "3"
memory: 8Gi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/clickhouse
name: data-volumeclaim-template
- mountPath: /var/lib/clickhouse/user_scripts
name: shared-binary-volume
- mountPath: /etc/clickhouse-server/functions
name: custom-functions-volume
- mountPath: /etc/clickhouse-server/config.d/
name: chi-signoz-tools-cluster-clickhouse-common-configd
- mountPath: /etc/clickhouse-server/users.d/
name: chi-signoz-tools-cluster-clickhouse-common-usersd
- mountPath: /etc/clickhouse-server/conf.d/
name: chi-signoz-tools-cluster-clickhouse-deploy-confd-cluster-0-0
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-hn6tq
readOnly: true
- mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
name: aws-iam-token
readOnly: true
initContainers:
- command:
- sh
- -c
- |
set -x
wget -O /tmp/histogramQuantile https://github.com/SigNoz/signoz/raw/develop/deploy/docker/clickhouse-setup/user_scripts/histogramQuantile
mv /tmp/histogramQuantile /var/lib/clickhouse/user_scripts/histogramQuantile
chmod +x /var/lib/clickhouse/user_scripts/histogramQuantile
env:
- name: AWS_STS_REGIONAL_ENDPOINTS
value: regional
- name: AWS_DEFAULT_REGION
value: us-east-1
- name: AWS_REGION
value: us-east-1
- name: AWS_ROLE_ARN
value: arn:aws:iam::xxxxxxxxxxx:role/ClickhouseEKSRole
- name: AWS_WEB_IDENTITY_TOKEN_FILE
value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
image: docker.io/alpine:3.18.2
imagePullPolicy: IfNotPresent
name: signoz-tools-cluster-clickhouse-udf-init
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/clickhouse/user_scripts
name: shared-binary-volume
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-hn6tq
readOnly: true
- mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
name: aws-iam-token
readOnly: true
nodeSelector:
karpenter.sh/capacity-type: on-demand
karpenter.sh/provisioner-name: observability-stack-provisioner
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 101
fsGroupChangePolicy: OnRootMismatch
runAsGroup: 101
runAsUser: 101
serviceAccount: signoz-tools-cluster-clickhouse
serviceAccountName: signoz-tools-cluster-clickhouse
subdomain: chi-signoz-tools-cluster-clickhouse-cluster-0-0
terminationGracePeriodSeconds: 30
tolerations:
- key: ObservabilityStackOnly
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: aws-iam-token
projected:
defaultMode: 420
sources:
- serviceAccountToken:
audience: sts.amazonaws.com
expirationSeconds: 86400
path: token
- name: data-volumeclaim-template
persistentVolumeClaim:
claimName: data-volumeclaim-template-chi-signoz-tools-cluster-clickhouse-cluster-0-0-0
- emptyDir: {}
name: shared-binary-volume
- configMap:
defaultMode: 420
name: signoz-tools-cluster-clickhouse-custom-functions
name: custom-functions-volume
- configMap:
defaultMode: 420
name: chi-signoz-tools-cluster-clickhouse-common-configd
name: chi-signoz-tools-cluster-clickhouse-common-configd
- configMap:
defaultMode: 420
name: chi-signoz-tools-cluster-clickhouse-common-usersd
name: chi-signoz-tools-cluster-clickhouse-common-usersd
- configMap:
defaultMode: 420
name: chi-signoz-tools-cluster-clickhouse-deploy-confd-cluster-0-0
name: chi-signoz-tools-cluster-clickhouse-deploy-confd-cluster-0-0
- name: kube-api-access-hn6tq
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
I changed the /var/lib/clickhouse/preprocessed_configs/config.xml file, adding AWS credentials, and the warning is no longer shown, but the AccessDenied error remains:
2024-10-11 14:40:56.684 INF pkg/backup/create.go:170 > done createBackupRBAC size=0B
2024-10-11 14:40:56.735 WRN pkg/backup/backuper.go:118 > MAX_FILE_SIZE=1073741824 is less than actual 17035327904, please remove general->max_file_size section from your config
2024-10-11 14:41:14.253 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: one of uploadObjectDiskParts go-routine return error: b.dst.CopyObject in /var/lib/clickhouse/disks/s3/backup/2024-10-11-remote2/shadow/signoz_logs/logs/s3 error: S3->CopyObject data2/ftx/jovjgrbdopnfqtkvwcgomhssdxifi -> my-bucket/backups/2024-10-11-remote2/s3/ftx/jovjgrbdopnfqtkvwcgomhssdxifi return error: operation error S3: CopyObject, https response error StatusCode: 403, RequestID: 1K78RWBZEA6DMSVK, HostID: MkVUQCZEHvUFrbZAMUM+gn5mZMFuw8tHNmfLmJRMSv256nJiUKzfsiglbhhtgkzKq+bWMqqmPfs=, api error AccessDenied: Access Denied table=signoz_logs.logs
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.signoz_index_v2
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_logs.logs_v2
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.durationSort
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.signoz_spans
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.samples_v2
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.samples_v4
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.samples_v4_agg_5m
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.samples_v4_agg_30m
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.time_series_v4
2024-10-11 14:41:14.254 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.time_series_v4_6hrs
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.time_series_v4_1day
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.time_series_v2
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_logs.tag_attributes
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.time_series_v4_1week
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.span_attributes
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.dependency_graph_minutes_v2
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.dependency_graph_minutes
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.signoz_error_index_v2
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_logs.logs_v2_resource
2024-10-11 14:41:14.255 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_logs.distributed_logs
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.distributed_logs
2024-10-11 14:41:14.255 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_logs.distributed_logs_v2
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.distributed_logs_v2
2024-10-11 14:41:14.255 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_samples_v2
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_samples_v2
2024-10-11 14:41:14.255 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_logs.distributed_tag_attributes
2024-10-11 14:41:14.255 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.distributed_tag_attributes
2024-10-11 14:41:14.256 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_samples_v4
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_samples_v4
2024-10-11 14:41:14.256 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_logs.distributed_logs_v2_resource
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.distributed_logs_v2_resource
2024-10-11 14:41:14.256 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_span_attributes
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_span_attributes
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_analytics.rule_state_history
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.usage_explorer
2024-10-11 14:41:14.256 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_time_series_v2
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_time_series_v2
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.usage
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.usage
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_logs.usage
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.top_level_operations
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.span_attributes_keys
2024-10-11 14:41:14.256 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_time_series_v4
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_time_series_v4
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_logs.logs_attribute_keys
2024-10-11 14:41:14.256 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_logs.logs_resource_keys
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.schema_migrations
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_logs.schema_migrations
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.schema_migrations
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_samples_v4_agg_30m
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_samples_v4_agg_30m
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_samples_v4_agg_5m
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_samples_v4_agg_5m
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.resource_keys_string_final_mv
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_time_series_v3
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_time_series_v3
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_logs.distributed_usage
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.distributed_usage
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_time_series_v4_1day
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_time_series_v4_1day
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_time_series_v4_1week
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_time_series_v4_1week
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_time_series_v4_6hrs
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_time_series_v4_6hrs
2024-10-11 14:41:14.257 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_usage
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_usage
2024-10-11 14:41:14.257 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.exp_hist
2024-10-11 14:41:14.258 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_logs.distributed_logs_resource_keys
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.distributed_logs_resource_keys
2024-10-11 14:41:14.258 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_logs.distributed_logs_attribute_keys
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.distributed_logs_attribute_keys
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.attribute_keys_string_final_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.samples_v4_agg_30m_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.attribute_keys_float64_final_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.samples_v4_agg_5m_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_logs.attribute_keys_bool_final_mv
2024-10-11 14:41:14.258 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_analytics.distributed_rule_state_history
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_analytics.distributed_rule_state_history
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_metrics.time_series_v3
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.time_series_v4_1day_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.time_series_v4_1week_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.time_series_v4_6hrs_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.dependency_graph_minutes_db_calls_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.dependency_graph_minutes_db_calls_mv_v2
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.dependency_graph_minutes_messaging_calls_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.dependency_graph_minutes_messaging_calls_mv_v2
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.dependency_graph_minutes_service_calls_mv
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.dependency_graph_minutes_service_calls_mv_v2
2024-10-11 14:41:14.258 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_dependency_graph_minutes
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_dependency_graph_minutes
2024-10-11 14:41:14.258 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_dependency_graph_minutes_v2
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_dependency_graph_minutes_v2
2024-10-11 14:41:14.258 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_durationSort
2024-10-11 14:41:14.258 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_durationSort
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_signoz_error_index_v2
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_signoz_error_index_v2
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_signoz_index_v2
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_signoz_index_v2
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_signoz_spans
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_signoz_spans
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_span_attributes_keys
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_span_attributes_keys
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_top_level_operations
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_top_level_operations
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_usage
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_usage
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_traces.distributed_usage_explorer
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.distributed_usage_explorer
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.durationSortMV
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.root_operations
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.signoz_error_index
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:278 > b.AddTableToLocalBackup error: context canceled table=signoz_traces.signoz_index
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.sub_root_operations
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_traces.usage_explorer_mv
2024-10-11 14:41:14.259 WRN pkg/backup/create.go:741 > supports only schema backup backup=2024-10-11-remote2 engine=Distributed operation=create table=signoz_metrics.distributed_exp_hist
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:296 > b.ch.GetInProgressMutations error: can't get in progress mutations: context canceled table=signoz_metrics.distributed_exp_hist
2024-10-11 14:41:14.259 ERR pkg/backup/create.go:139 > backup failed error: one of createBackupLocal go-routine return error: one of uploadObjectDiskParts go-routine return error: b.dst.CopyObject in /var/lib/clickhouse/disks/s3/backup/2024-10-11-remote2/shadow/signoz_logs/logs/s3 error: S3->CopyObject data2/ftx/jovjgrbdopnfqtkvwcgomhssdxifi -> my-bucket/backups/2024-10-11-remote2/s3/ftx/jovjgrbdopnfqtkvwcgomhssdxifi return error: operation error S3: CopyObject, https response error StatusCode: 403, RequestID: 1K78RWBZEA6DMSVK, HostID: MkVUQCZEHvUFrbZAMUM+gn5mZMFuw8tHNmfLmJRMSv256nJiUKzfsiglbhhtgkzKq+bWMqqmPfs=, api error AccessDenied: Access Denied
2024-10-11 14:41:14.525 INF pkg/backup/delete.go:185 > cleanBackupObjectDisks deleted 0 keys backup=2024-10-11-remote2 duration=35ms
2024-10-11 14:41:14.525 INF pkg/backup/delete.go:157 > remove '/var/lib/clickhouse/backup/2024-10-11-remote2'
2024-10-11 14:41:14.613 INF pkg/backup/delete.go:157 > remove '/var/lib/clickhouse/disks/s3_backup/backup/2024-10-11-remote2'
2024-10-11 14:41:14.613 INF pkg/backup/delete.go:157 > remove '/var/lib/clickhouse/disks/s3/backup/2024-10-11-remote2'
2024-10-11 14:41:14.618 INF pkg/backup/delete.go:166 > done backup=2024-10-11-remote2 duration=359ms location=local operation=delete
2024-10-11 14:41:14.733 INF pkg/backup/delete.go:43 > /var/lib/clickhouse/shadow
2024-10-11 14:41:14.733 INF pkg/backup/delete.go:43 > /var/lib/clickhouse/disks/s3_backup/shadow
2024-10-11 14:41:14.741 INF pkg/backup/delete.go:43 > /var/lib/clickhouse/disks/s3/shadow
2024-10-11 14:41:14.741 FTL cmd/clickhouse-backup/main.go:658 > error="one of createBackupLocal go-routine return error: one of uploadObjectDiskParts go-routine return error: b.dst.CopyObject in /var/lib/clickhouse/disks/s3/backup/2024-10-11-remote2/shadow/signoz_logs/logs/s3 error: S3->CopyObject data2/ftx/jovjgrbdopnfqtkvwcgomhssdxifi -> my-bucket/backups/2024-10-11-remote2/s3/ftx/jovjgrbdopnfqtkvwcgomhssdxifi return error: operation error S3: CopyObject, https response error StatusCode: 403, RequestID: 1K78RWBZEA6DMSVK, HostID: MkVUQCZEHvUFrbZAMUM+gn5mZMFuw8tHNmfLmJRMSv256nJiUKzfsiglbhhtgkzKq+bWMqqmPfs=, api error AccessDenied: Access Denied"
Note: the IAM role has S3 full access, and the AWS credentials belong to my AWS user (admin access).
See the code block here
Is the srcBucket variable empty?
Compare with the output log:
error="one of createBackupLocal go-routine return error: one of uploadObjectDiskParts go-routine return error: b.dst.CopyObject in /var/lib/clickhouse/disks/s3/backup/2024-10-11-remote2/shadow/signoz_logs/logs/s3 error: S3->CopyObject data2/rky/guvjhazneieklouevfhijqiaduqlk -> my-bucket/backups/2024-10-11-remote2/s3/rky/guvjhazneieklouevfhijqiaduqlk return error: operation error S3: CopyObject, https response error StatusCode: 403, RequestID: AS1YYQCZF4KJ8QZY, HostID: /8ntBN2alKtBkXTy9YcODvCAnEb/bDf8KbJH1mOL0OlTJwChCkH3bysFHih4k9x+cVqKOST3Pd0=, api error AccessDenied: Access Denied"
S3->CopyObject data2/rky/guvjhazneieklouevfhijqiaduqlk -> my-bucket/backups/2024-10-11-remote2/s3/rky/guvjhazneieklouevfhijqiaduqlk
The log shows only the object key, but not the source bucket.
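For context, aws-sdk-go-v2's CopyObject expects its CopySource parameter in the form `<source-bucket>/<source-key>`, which S3 signs as the `X-Amz-Copy-Source` header. A minimal sketch of what a well-formed value should look like (`buildCopySource` is a hypothetical helper for illustration, not clickhouse-backup's actual code):

```go
package main

import "fmt"

// buildCopySource composes the CopySource string that S3 CopyObject expects:
// "<source-bucket>/<source-key>". In the failing request, X-Amz-Copy-Source
// was "data2/<key>", i.e. the key prefix "data2" landed where the source
// bucket name should be.
func buildCopySource(srcBucket, srcKey string) string {
	return fmt.Sprintf("%s/%s", srcBucket, srcKey)
}

func main() {
	// Hypothetical names mirroring the log output.
	fmt.Println(buildCopySource("xxxxxxxxxxxxx-cold-storage-tools", "data2/rky/guvjhazneieklouevfhijqiaduqlk"))
}
```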
The S3 debug logs:
2024-10-11 15:16:27.034 INF pkg/storage/s3.go:49 > [s3:DEBUG] Request
GET /?versioning= HTTP/1.1
Host: data2.s3.xxxxxxxxxxxxx-cold-storage-tools.amazonaws.com
User-Agent: m/F aws-sdk-go-v2/1.30.5 os/linux lang/go#1.22.7 md/GOOS#linux md/GOARCH#arm64 api/s3#1.61.2
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: 338f4e3a-00cd-4f38-a096-ea36114e0b97
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=**********/20241011/xxxxxxxxxxxxx-cold-storage-tools/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date, Signature=xxxxxxxxxxx
X-Amz-Content-Sha256: xxxxxxxxxxx
X-Amz-Date: 20241011T151627Z
2024-10-11 15:16:27.051 INF pkg/storage/s3.go:49 > [s3:DEBUG] request failed with unretryable error https response error StatusCode: 0, RequestID: , HostID: , request send failed, Get "https://data2.s3.xxxxxxxxxxxxx-cold-storage-tools.amazonaws.com/?versioning=": dial tcp: lookup data2.s3.xxxxxxxxxxxxx-cold-storage-tools.amazonaws.com on 10.205.0.10:53: no such host
2024-10-11 15:16:27.071 INF pkg/storage/s3.go:49 > [s3:DEBUG] Request
PUT /backups/2024-10-11-remote2/s3/rky/guvjhazneieklouevfhijqiaduqlk?x-id=CopyObject HTTP/1.1
Host: xxxxxxxxxxxxx-backup-tools.s3.us-east-1.amazonaws.com
User-Agent: m/F aws-sdk-go-v2/1.30.5 os/linux lang/go#1.22.7 md/GOOS#linux md/GOARCH#arm64 api/s3#1.61.2
Content-Length: 0
Accept-Encoding: identity
Amz-Sdk-Invocation-Id: 3736572a-701b-4537-978c-7d8d3b1d54e5
Amz-Sdk-Request: attempt=1; max=3
Authorization: AWS4-HMAC-SHA256 Credential=**********/20241011/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-security-token;x-amz-storage-class, Signature=xxxxxxxxxxx
X-Amz-Content-Sha256: xxxxxxxxxxx
X-Amz-Copy-Source: data2/rky/guvjhazneieklouevfhijqiaduqlk
X-Amz-Date: 20241011T151627Z
X-Amz-Security-Token: xxxxxxxxxxx
X-Amz-Storage-Class: STANDARD
The bucket key prefix (data2/) ended up inside the S3 host: Host: data2.s3.xxxxxxxxxxxxx-cold-storage-tools.amazonaws.com
Is that correct?
xxxxxxxxxxxxx-cold-storage-tools is the bucket of the s3 disk set in config.xml
The S3 endpoint string format was wrong.
I changed https://xxxxxx-storage-test-tools.s3.amazonaws.com/data/ to https://xxxxxx-storage-test-tools.s3.us-east-1.amazonaws.com/data/ (adding the explicit region) and it works.
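In other words, the disk endpoint needs the region-qualified virtual-hosted form so the bucket and key prefix are parsed correctly. A sketch of a corrected disk configuration (bucket name and prefix are placeholders):

```xml
<storage_configuration>
    <disks>
        <s3>
            <type>s3</type>
            <!-- region-qualified form: https://<bucket>.s3.<region>.amazonaws.com/<prefix>/ -->
            <endpoint>https://my-bucket.s3.us-east-1.amazonaws.com/data2/</endpoint>
            <use_environment_credentials>true</use_environment_credentials>
        </s3>
    </disks>
</storage_configuration>
```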
Does the image xxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/docker-hub/clickhouse/clickhouse-server:24.1.2-alpine contain the clickhouse-backup binary?
No. I installed the clickhouse-backup binary manually in the container.
Unfortunately, https://github.com/SigNoz/charts/blob/main/charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml#L202 doesn't allow running a second container with clickhouse-backup.
In this case I would propose using the standard BACKUP and RESTORE SQL commands available in modern clickhouse-server versions;
see the details in https://clickhouse.com/docs/en/operations/backup
You can just create a kind: CronJob which executes something like
clickhouse-client -h chi...-0-0 --user ... --password ... -q "BACKUP ALL ON CLUSTER '{cluster}' TO S3(...)"
and for restore a kind: Job which executes something like
clickhouse-client -h chi...-0-0-0 --user ... --password ... -q "RESTORE ALL ON CLUSTER '{cluster}' FROM S3(...)"
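A minimal CronJob sketch along these lines (name, image, schedule, host, bucket, and the Secret are all placeholders; adapt them to your cluster):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: clickhouse-embedded-backup
spec:
  schedule: "0 2 * * *"           # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: backup
              image: clickhouse/clickhouse-server:24.8
              command:
                - clickhouse-client
                - -h
                - chi-example-cluster-0-0       # placeholder ClickHouse service/host
                - --user=default
                - --password=$(CLICKHOUSE_PASSWORD)
                - -q
                - BACKUP ALL ON CLUSTER '{cluster}' TO S3('https://my-bucket.s3.us-east-1.amazonaws.com/backups/')
              env:
                - name: CLICKHOUSE_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: clickhouse-credentials   # placeholder Secret
                      key: password
```

A restore would use the same shape as a kind: Job with RESTORE ... FROM S3(...) in the query.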
I'm thinking of forking the chart and customizing it to provide sidecar containers alongside clickhouse-server.
About the embedded backup suggestion: I tried it, but the backup fails for our clustered workload, so I use clickhouse-backup for this.
Another option is to run a CronJob that connects to the clickhouse-server pod through kubectl and runs the backup there.
Which failure do you get with BACKUP ALL ON CLUSTER?
Did you check SELECT * FROM system.backup_log?
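When the embedded BACKUP is used, failure details land in system.backup_log. A query sketch for inspecting the most recent attempts (assuming the standard columns of that table):

```sql
SELECT name, status, error, start_time, end_time
FROM system.backup_log
ORDER BY start_time DESC
LIMIT 10
```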
When backing up with the ON CLUSTER clause I would have to handle a lot of parts synchronization, and we do not have deep knowledge about this. We are new ClickHouse users and still learning. clickhouse-backup has deeper management features, and I prefer it.
Before ClickHouse's BACKUP/RESTORE features, we used Velero, but during recovery we had too many parts and other errors to handle.
I think "too many parts" is not related to the backup tool used ;)
It is usually caused by a wrong INSERT pattern and small insert batch sizes, which produce a lot of small data parts.
BACKUP ALL ... ON CLUSTER should work very well on clickhouse-server:24.8.
ON CLUSTER just means the part uploads to S3 are spread between the replicas inside each shard; there is not as much parts synchronization as you think.
Example error
Received exception from server (version 24.1.2):
Code: 647. DB::Exception: Received from localhost:9000. DB::Exception: Got error from chi%2Dsignoz%2Dtools%2Dcluster%2Dclickhouse%2Dcluster%2D2%2D0:9000. DB::Exception: Table signoz_logs.logs_v2 on replica chi-signoz-tools-cluster-clickhouse-cluster-0-0 has part 20240927_1_1_0 different from the part on replica chi-signoz-tools-cluster-clickhouse-cluster-2-0 (checksum '5d2c4cb2a3959b040da2e13c398090fb' on replica chi-signoz-tools-cluster-clickhouse-cluster-0-0 != checksum 'a234ebfd4d43dbb6639eccbb5e286882' on replica chi-signoz-tools-cluster-clickhouse-cluster-2-0). (CANNOT_BACKUP_TABLE)
When we changed from 1 shard to 2 shards, the built-in replication was used and no manual steps were done. I don't know if further steps are necessary.
> has part 20240927_1_1_0 different from the part on replica
> When we changed from 1 shard to 2 shards, the organic replication was used
Hm, could you share
SELECT hostName(), engine_full FROM cluster('all-sharded',system.tables) WHERE database='signoz_logs' AND table='logs_v2'
To fix your issue, I would propose running
kubectl exec chi-signoz-tools-cluster-clickhouse-cluster-0-0 -- clickhouse-client --receive-timeout=86400 -q "OPTIMIZE TABLE signoz_logs.logs_v2 PARTITION 20240927 FINAL"
and trying BACKUP again.
I ran the OPTIMIZE TABLE command for the partition above, and before each BACKUP execution I needed to run OPTIMIZE TABLE for that partition again. Finally, I ran it for all partitions (no PARTITION argument in the OPTIMIZE command), but the part-mismatch error is still shown.
Curiously, some OPTIMIZE executions showed these errors:
Code: 53. DB::Exception: Received from localhost:9000. DB::Exception: There was an error on [chi-signoz-tools-cluster-clickhouse-cluster-2-0:9000]: Code: 53. DB::Exception: Type mismatch in IN or VALUES section. Expected: Date. Got: Float64. (TYPE_MISMATCH) (version 24.1.2.5 (official build)). (TYPE_MISMATCH)
Code: 53. DB::Exception: Received from localhost:9000. DB::Exception: Type mismatch in IN or VALUES section. Expected: Date. Got: Float64. (TYPE_MISMATCH)
Hm, could you share
SELECT hostName(), engine_full FROM cluster('all-sharded',system.tables) WHERE database='signoz_logs' AND table='logs_v2'
┌─hostName()────────────────────────────────────────┬─engine_full────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ chi-signoz-tools-cluster-clickhouse-cluster-0-0-0 │ ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') PARTITION BY toDate(timestamp / 1000000000) ORDER BY (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id) TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(1296000) SETTINGS ttl_only_drop_parts = 1, index_granularity = 8192 │
└───────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
┌─hostName()────────────────────────────────────────┬─engine_full────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ chi-signoz-tools-cluster-clickhouse-cluster-2-0-0 │ ReplicatedMergeTree('/clickhouse/tables/c111787f-3753-4163-936e-89c8ffca0867/{shard}', '{replica}') PARTITION BY toDate(timestamp / 1000000000) ORDER BY (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id) TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(1296000) SETTINGS ttl_only_drop_parts = 1, index_granularity = 8192 │
└───────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
┌─hostName()────────────────────────────────────────┬─engine_full────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ chi-signoz-tools-cluster-clickhouse-cluster-1-0-0 │ ReplicatedMergeTree('/clickhouse/tables/c111787f-3753-4163-936e-89c8ffca0867/{shard}', '{replica}') PARTITION BY toDate(timestamp / 1000000000) ORDER BY (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id) TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(1296000) SETTINGS ttl_only_drop_parts = 1, index_granularity = 8192 │
└───────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Did you receive the errors above when executing the BACKUP command, or from something else?
Could you share the full stacktrace in this case?
Moreover, let's compare the uuid:
SELECT hostName(), uuid, engine_full FROM cluster('all-sharded',system.tables) WHERE database='signoz_logs' AND table='logs_v2'
upgrade your clickhouse-server version to 24.8
SELECT
hostName(),
uuid,
engine_full
FROM cluster('all-sharded', system.tables)
WHERE (database = 'signoz_logs') AND (table = 'logs_v2')
┌─hostName()────────────────────────────────────────┬─uuid─────────────────────────────────┬─engine_full────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ chi-signoz-tools-cluster-clickhouse-cluster-0-0-0 │ c111787f-3753-4163-936e-89c8ffca0867 │ ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}') PARTITION BY toDate(timestamp / 1000000000) ORDER BY (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id) TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(1296000) SETTINGS ttl_only_drop_parts = 1, index_granularity = 8192 │
└───────────────────────────────────────────────────┴──────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
┌─hostName()────────────────────────────────────────┬─uuid─────────────────────────────────┬─engine_full────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ chi-signoz-tools-cluster-clickhouse-cluster-2-0-0 │ c111787f-3753-4163-936e-89c8ffca0867 │ ReplicatedMergeTree('/clickhouse/tables/c111787f-3753-4163-936e-89c8ffca0867/{shard}', '{replica}') PARTITION BY toDate(timestamp / 1000000000) ORDER BY (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id) TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(1296000) SETTINGS ttl_only_drop_parts = 1, index_granularity = 8192 │
└───────────────────────────────────────────────────┴──────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
┌─hostName()────────────────────────────────────────┬─uuid─────────────────────────────────┬─engine_full────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ chi-signoz-tools-cluster-clickhouse-cluster-1-0-0 │ c111787f-3753-4163-936e-89c8ffca0867 │ ReplicatedMergeTree('/clickhouse/tables/c111787f-3753-4163-936e-89c8ffca0867/{shard}', '{replica}') PARTITION BY toDate(timestamp / 1000000000) ORDER BY (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id) TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(1296000) SETTINGS ttl_only_drop_parts = 1, index_granularity = 8192 │
└───────────────────────────────────────────────────┴──────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Yes, the error shown above came from the BACKUP command.
chi-signoz-tools-cluster-clickhouse-cluster-0-0-0.chi-signoz-tools-cluster-clickhouse-cluster-0-0.signoz.svc.cluster.local :) BACKUP ALL ON CLUSTER 'cluster' TO S3('https://xxxxxxxxxx-backup-tools.s3.us-east-1.amazonaws.com/EMBED_BACKUP/')
BACKUP ALL ON CLUSTER cluster TO S3('https://xxxxxxxxxx-backup-tools.s3.us-east-1.amazonaws.com/EMBED_BACKUP/')
Query id: 8bce120e-15cc-4051-ae34-21c8fe3adf6a
Elapsed: 7.454 sec.
Received exception from server (version 24.1.2):
Code: 647. DB::Exception: Received from localhost:9000. DB::Exception: Got error from chi%2Dsignoz%2Dtools%2Dcluster%2Dclickhouse%2Dcluster%2D1%2D0:9000. DB::Exception: Table signoz_logs.logs_v2 on replica chi-signoz-tools-cluster-clickhouse-cluster-1-0 has part 20240929_2_2_0 different from the part on replica chi-signoz-tools-cluster-clickhouse-cluster-2-0 (checksum 'b87066065558b8e0f1790072f9d48853' on replica chi-signoz-tools-cluster-clickhouse-cluster-1-0 != checksum '80a4583914d9af71921f01fa326978ab' on replica chi-signoz-tools-cluster-clickhouse-cluster-2-0). (CANNOT_BACKUP_TABLE)
upgrade your clickhouse-server version to 24.8
What is the motivation for that?
OK, the uuid is the same, so replication works.
Let's check how many parts have the same name but different hashes:
SELECT groupArray(h) AS all_hosts, name, database, table, groupArray(hash_of_all_files) AS all_hashes
FROM (
    SELECT hostName() AS h, name, database, table, hash_of_all_files
    FROM cluster('all-sharded', system.parts)
    WHERE engine ILIKE '%Replicated%'
)
GROUP BY name, database, table
HAVING length(arrayDistinct(all_hashes)) > 1
> upgrade your clickhouse-server version to 24.8
> What is the motivation for that?
24.8 is an LTS release, so I hope it has a more stable implementation of BACKUP.
Moreover, let's apply:
OPTIMIZE TABLE signoz_logs.logs_v2 ON CLUSTER '{cluster}' FINAL
Did you check that the mutation finished successfully via
SELECT hostName(), * FROM cluster('all-sharded',system.mutations) WHERE query ILIKE '%OPTIMIZE%FINAL%' FORMAT Vertical
I'll close the issue because the initial problem was solved.
I'm using clickhouse-backup instead of the embedded backup.