Unable to update pgBackRest configuration from single local volume repo to multirepo - local and s3 (MinIO)
Overview
My first deployment of the Postgres cluster was a helm installation that created the cluster without any backups at all. Now that I have time to improve the deployment, I am trying to use the multirepo backup strategy, with backups stored both locally in Kubernetes volumes (backed by rook-ceph) and in MinIO, external to the cluster.
The issue happens when I update the postgrescluster CR configuration, adding a second repo (repo2) with MinIO (s3) configuration. To be clear: I have a Postgres cluster with no backups configured. If I add only local volumes for backups, the configuration change is applied as expected. If I try to add both local and s3 repos, the operator complains that the pgBackRest configuration changed and there is a hash mismatch. On the other hand, if I create the cluster with the multirepo configuration (the same config attempted before) from scratch, rather than updating in place, it does not complain.
Environment
Please provide the following details:
- Platform: Rancher
- Platform Version: 1.22.9
- PGO Image Tag: ubi8-5.1.1-0
- Postgres Version: 14
- Storage: rook-ceph storageclass
Steps to Reproduce
1. Install the cluster using helm (pgo, then postgres) without any pgBackRest configuration.
2. Upgrade the installation using helm (pgo then postgres, although pgo won't change), providing pgBackRest configuration for both local and s3 (MinIO) repos.
3. The operator will throw:
time="2022-07-25T19:47:28Z" level=info msg="pgBackRest config hash mismatch detected, requeuing to reattempt stanza create" file="internal/controller/postgrescluster/pgbackrest.go:1328" func="postgrescluster.(*Reconciler).reconcilePGBackRest" name=postgres namespace=namespacename reconciler=pgBackRest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.1-0
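Sketched as shell commands (the release names, chart paths, and values file names below are placeholders, not the exact ones used):

```shell
# Placeholder names: "pgo" / "postgres" releases, local chart paths,
# and values files matching the specs pasted further down.
helm install pgo ./pgo -n namespacename -f pgo-values.yaml
helm install postgres ./postgres -n namespacename -f postgres-values-local-only.yaml

# Later, add the repo2 (s3/MinIO) section to the values and upgrade in place:
helm upgrade postgres ./postgres -n namespacename -f postgres-values-multirepo.yaml
```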
EXPECTED
The operator would not complain and the backup would start automatically, as happens with a single-repo configuration.
ACTUAL
The stanza create is requeued and keeps retrying.
Logs
Operator log:
time="2022-07-25T19:47:28Z" level=info msg="pgBackRest config hash mismatch detected, requeuing to reattempt stanza create" file="internal/controller/postgrescluster/pgbackrest.go:1328" func="postgrescluster.(*Reconciler).reconcilePGBackRest" name=postgres namespace=namespacename reconciler=pgBackRest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.1-0
pgo helm values:
singleNamespace: true
debug: false
imagePullSecretNames:
  - "secret-x"
controllerImages:
  cluster: "registry.developers.crunchydata.com/crunchydata/postgres-operator:ubi8-5.1.1-0"
  upgrade: "registry.developers.crunchydata.com/crunchydata/postgres-operator-upgrade:ubi8-5.1.1-0"
# relatedImages are used when an image is omitted from PostgresCluster or PGUpgrade specs.
relatedImages:
  postgres_14:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.3-0"
  postgres_14_gis_3.1:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-14.3-3.1-0"
  postgres_13:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-13.7-0"
  postgres_13_gis_3.1:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-13.7-3.1-0"
  pgadmin:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-pgadmin4:ubi8-4.30-1"
  pgbackrest:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-1"
  pgbouncer:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.16-3"
  pgexporter:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.1.1-0"
  pgupgrade:
    image: "registry.developers.crunchydata.com/crunchydata/crunchy-upgrade:ubi8-5.1.1-0"
resources:
  limits:
    cpu: 100m
    memory: 100Mi
  requests:
    cpu: 100m
    memory: 100Mi
Working postgrescluster CR helm values:
name: "postgres"
postgresVersion: 14
postGISVersion: 3.1
monitoring: true
imagePostgres: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-14.3-3.1-0"
imagePgBackRest: "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-1"
imagePgBouncer: "registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.16-3"
imageExporter: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.1.1-0"
imagePullSecrets:
  - name: "secret-x"
instanceName: postgres-instance
instanceSize: 30Gi
patroni:
  dynamicConfiguration:
    postgresql:
      pg_hba:
        - "hostnossl all all all md5" # Required to allow non-tls connections
      parameters:
        shared_buffers: 3140MB
        work_mem: 4MB
        maintenance_work_mem: 1570MB
        max_connections: 200
        effective_io_concurrency: 200
        max_worker_processes: 19
        max_parallel_workers_per_gather: 4
        wal_buffers: 16MB
        max_wal_size: 1GB
        min_wal_size: 512MB
        random_page_cost: 1.1
        effective_cache_size: 9421MB
        default_statistics_target: 500
        log_timezone: 'UTC'
        ssl: "off"
        autovacuum_max_workers: 10
        autovacuum_naptime: 10
        datestyle: 'iso, mdy'
        max_locks_per_transaction: 128
        shared_preload_libraries: timescaledb
        timescaledb.telemetry_level: basic
databaseInitSQL:
  name: bootstrap-sql
  key: sql
pgBouncerConfig:
  config:
    global:
      max_client_conn: "200"
      server_tls_sslmode: disable # Required to allow non-tls connections
users:
  - name: postgres
    password:
      type: AlphaNumeric
openshift: false
instanceCPU: 1
instanceMemory: 2Gi
pgBackRestConfig:
  repos:
    - name: repo1
      schedules:
        full: "0 0 * * 6" # every Saturday
        incremental: "*/30 * * * *" # every 30m
      volume:
        volumeClaimSpec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 35Gi
  global:
    repo1-retention-full: "14"
    repo1-retention-full-type: time
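For reference, the only additions in the failing spec below are the repo2 entry and its repo2-* global options (values copied from that spec):

```yaml
pgBackRestConfig:
  repos:
    # repo1 (local volume) stays as above
    - name: repo2
      s3:
        bucket: "postgres-backups"
        endpoint: "10.0.0.6:9000"
        region: "notusedbyminio"
  global:
    repo2-s3-uri-style: path
    repo2-storage-verify-tls: "n"
    repo2-storage-port: "9000"
    repo2-s3-key: "xxxxxxxxx"
    repo2-s3-key-secret: "yyyyyyy"
```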
Non-working postgrescluster CR helm values:
name: "postgres"
postgresVersion: 14
postGISVersion: 3.1
monitoring: true
imagePostgres: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-14.3-3.1-0"
imagePgBackRest: "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-1"
imagePgBouncer: "registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:ubi8-1.16-3"
imageExporter: "registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.1.1-0"
imagePullSecrets:
  - name: "secret-x"
instanceName: postgres-instance
instanceSize: 30Gi
patroni:
  dynamicConfiguration:
    postgresql:
      pg_hba:
        - "hostnossl all all all md5" # Required to allow non-tls connections
      parameters:
        shared_buffers: 3140MB
        work_mem: 4MB
        maintenance_work_mem: 1570MB
        max_connections: 200
        effective_io_concurrency: 200
        max_worker_processes: 19
        max_parallel_workers_per_gather: 4
        wal_buffers: 16MB
        max_wal_size: 1GB
        min_wal_size: 512MB
        random_page_cost: 1.1
        effective_cache_size: 9421MB
        default_statistics_target: 500
        log_timezone: 'UTC'
        ssl: "off"
        autovacuum_max_workers: 10
        autovacuum_naptime: 10
        datestyle: 'iso, mdy'
        max_locks_per_transaction: 128
        shared_preload_libraries: timescaledb
        timescaledb.telemetry_level: basic
databaseInitSQL:
  name: bootstrap-sql
  key: sql
pgBouncerConfig:
  config:
    global:
      max_client_conn: "200"
      server_tls_sslmode: disable # Required to allow non-tls connections
users:
  - name: postgres
    password:
      type: AlphaNumeric
openshift: false
instanceCPU: 1
instanceMemory: 2Gi
pgBackRestConfig:
  repos:
    - name: repo1
      schedules:
        full: "0 0 * * 6"
        incremental: "0 */12 * * *"
      volume:
        volumeClaimSpec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 35Gi
    - name: repo2
      s3:
        bucket: "postgres-backups"
        endpoint: "10.0.0.6:9000"
        region: "notusedbyminio"
  global:
    repo1-retention-full: "14"
    repo1-retention-full-type: time
    repo2-s3-uri-style: path
    repo2-storage-verify-tls: "n"
    repo2-storage-port: "9000"
    repo2-s3-key: "xxxxxxxxx"
    repo2-s3-key-secret: "yyyyyyy"
Hello,
Are there any suggestions?
Hey @rmiguelac, that warning is expected when we run into an issue with stanza-create. As the cluster continues to reconcile, it should get into a better state and be able to create the stanza.
I replicated that log message by adding a new cloud backup repo in addition to an initial PV-based repo. The warning showed up a handful of times, and then I finally saw this message:
time="2022-08-23T14:08:20Z" level=debug msg=Normal file="sigs.k8s.io/[email protected]/pkg/internal/recorder/recorder.go:98" func="recorder.(*Provider).getBroadcaster.func1.1" message="pgBackRest stanza creation completed successfully" object="{PostgresCluster postgres-operator test-1 00462c52-5f3d-488f-ab0f-862c9394ea17 postgres-operator.crunchydata.com/v1beta1 3027734 }" reason=StanzasCreated version=latest
What does your postgrescluster status look like? It should show if stanzas have been created for each repo.
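For example, the relevant fields can be pulled directly (the cluster name "postgres" and namespace "namespacename" below are taken from the operator logs in this thread; substitute your own):

```shell
# Show the per-repo pgBackRest status (bound, stanzaCreated, etc.)
# for the PostgresCluster named "postgres" in namespace "namespacename".
kubectl -n namespacename get postgrescluster postgres \
  -o jsonpath='{.status.pgbackrest.repos}'
```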
status:
  conditions:
    - lastTransitionTime: "2022-08-23T18:49:18Z"
      message: pgBackRest replica create repo is ready for backups
      observedGeneration: 2
      reason: StanzaCreated
      status: "True"
      type: PGBackRestReplicaRepoReady
  pgbackrest:
    repos:
      - bound: true
        name: repo1
        replicaCreateBackupComplete: true
        stanzaCreated: true
        volume: pvc-f4e97143-b710-4b8e-9d85-c724099f90cc
      - name: repo4
        repoOptionsHash: 6d6c5d6679
        stanzaCreated: true