postgres-operator-examples

Master or replica database pod crashes with no information during kubectl cp into /tmp

Open toolib-kent opened this issue 3 years ago • 1 comments

We are facing some issues when using the Postgres Operator Helm chart.

PGO version: v5.1.1 and 5.1.2
Install Chart version: v3.2.0
PostgresCluster version: v.3.5.0
Cluster: AWS EKS Kubernetes cluster v1.22.9

When we transfer files between the local host and /tmp in the database pods, or when we import a database via psql from an SQL dump stored in /tmp, the pod crashes and is recreated, apparently with no error logs:

StatefulSet <namespace>/<release-name>-prod-cluster-ha-6d2k is recreating failed Pod toolib-prod-cluster-ha-6d2k-0

The container images used in the pods:

registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-14.4-3.1-0
registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-2
registry.developers.crunchydata.com/crunchydata/crunchy-postgres-exporter:ubi8-5.1.2-0

Volumes currently mounted :

/pgconf/tls from cert-volume (ro)
/pgdata from postgres-data (rw)
/tmp from tmp (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cn92c (ro)

I guess the /tmp volume listed above is AWS node storage mounted into the pod.
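
One way to check that guess instead of assuming it is to look at how the `tmp` volume is defined in the pod spec. The sketch below is a small, hypothetical Python helper (not part of PGO). It assumes the pod has been dumped with `kubectl get pod <pod> -n <namespace> -o json` and reports the source type of a named volume, plus any `sizeLimit` set on an `emptyDir`. The sample pod fragment, including the claim name and the size limit, is made up purely to show the shape of the data.

```python
def describe_volume(pod: dict, volume_name: str) -> dict:
    """Report the source type of a named pod volume and any emptyDir sizeLimit."""
    for volume in pod.get("spec", {}).get("volumes", []):
        if volume.get("name") != volume_name:
            continue
        # Besides "name", a volume entry carries one key naming its source,
        # e.g. "emptyDir", "hostPath" or "persistentVolumeClaim".
        source_type = next((key for key in volume if key != "name"), None)
        settings = (volume.get(source_type) or {}) if source_type else {}
        size_limit = settings.get("sizeLimit") if source_type == "emptyDir" else None
        return {"type": source_type, "sizeLimit": size_limit}
    raise KeyError(f"volume {volume_name!r} not found in pod spec")


# Hypothetical pod fragment; real data would come from
# `kubectl get pod <pod> -n <namespace> -o json` parsed with json.load().
sample_pod = {
    "spec": {
        "volumes": [
            {"name": "postgres-data",
             "persistentVolumeClaim": {"claimName": "example-claim"}},
            {"name": "tmp", "emptyDir": {"sizeLimit": "64Mi"}},
        ]
    }
}

print(describe_volume(sample_pod, "tmp"))
```

If the volume turns out to be an `emptyDir` with a `sizeLimit`, writing more data than that limit into /tmp can get the pod evicted by the kubelet. That would be one possible explanation for a restart without application error logs, but it is something to verify, not a confirmed diagnosis.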

We use a basic PostGIS configuration and values here. We tried removing the Patroni configuration and setting the PV and PVC to ReadWriteMany access mode on NFS.

toolib-kent, Aug 04 '22 07:08

(1) Are you still experiencing this problem?

(2) When the pod fails, have you been able to describe the pod to see what the exit code was? (I wonder if the kubectl cp is somehow getting the pod OOM-killed...)

(3) Taking a step back, what exactly is the desired result here? It sounds like you start up a PostGIS cluster and then try to load some data into the pod (or pull some data from it). I haven't run into the pod restarting, but then I've only used kubectl cp on smaller files. Might there be a workaround where you pre-create the data volume with the data you want and then pass that volume to the Postgres cluster? I mean: maybe kubectl cp is the problem here and there's some way to work around it.

ETA: (4) Just to check, what is the size of the files you're trying to move around, and what are the resource limits on the pods?
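
For (2) and (4), the relevant fields can be read from the pod's JSON as well as from `kubectl describe pod`. The following is a minimal sketch under stated assumptions, not a PGO tool: the helper names are hypothetical, the input is assumed to be the output of `kubectl get pod <pod> -n <namespace> -o json`, and the sample pod (container name, exit code, reason, memory limit) is invented for illustration only.

```python
def termination_report(pod: dict) -> dict:
    """Collect pod-level failure info and each container's last termination."""
    status = pod.get("status", {})
    report = {
        "pod_reason": status.get("reason"),    # set to "Evicted" on eviction
        "pod_message": status.get("message"),
        "containers": {},
    }
    for cs in status.get("containerStatuses", []):
        # Prefer the previous run (lastState); fall back to the current state.
        terminated = (cs.get("lastState", {}).get("terminated")
                      or cs.get("state", {}).get("terminated"))
        if terminated:
            report["containers"][cs["name"]] = {
                "exitCode": terminated.get("exitCode"),
                "reason": terminated.get("reason"),
            }
    return report


def resource_limits(pod: dict) -> dict:
    """Map each container name to its resources.limits (empty if unset)."""
    containers = pod.get("spec", {}).get("containers", [])
    return {c["name"]: c.get("resources", {}).get("limits", {}) for c in containers}


# Hypothetical pod data, for illustration only.
sample_pod = {
    "spec": {"containers": [
        {"name": "database", "resources": {"limits": {"memory": "2Gi"}}},
    ]},
    "status": {"containerStatuses": [
        {"name": "database",
         "lastState": {"terminated": {"exitCode": 137, "reason": "OOMKilled"}}},
    ]},
}

print(termination_report(sample_pod))
print(resource_limits(sample_pod))
```

Note that if the pod was deleted and recreated rather than restarted in place, the old pod's status is no longer available, so the namespace events (`kubectl get events -n <namespace>`) may be the only remaining record of why it failed.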

benjaminjb, Oct 13 '22 21:10

Hi @toolib-kent

We hope that the above suggestions from Ben have helped you. If you continue to have issues, feel free to create a new issue or re-open this one.

ValClarkson, Oct 28 '22 19:10