postgres-operator

CrunchyData "repo" (pgbackrest) instance not using serviceAccount. (Permissions in the namespace's "default" serviceAccount affect deployment)

Open nnachefski opened this issue 3 years ago • 7 comments

I created a DB using CrunchyData (named "tracking"), but I also have the "anyuid" policy set for the project's 'default' ServiceAccount. The initContainer ("pgbackrest-log-dir") in the "tracking-repo-host" StatefulSet failed to deploy, citing:

mkdir: cannot create directory ‘/pgbackrest/repo1/log’: Permission denied

# oc get sts tracking-repo-host -n dev -o yaml |grep serviceAccount
<nothing>
# oc get sa |grep tracking
tracking-instance          1         142m
tracking-pgbackrest        1         142m     <----- shouldn't the sts be using this SA and not 'default'?

If I remove the 'anyuid' ClusterRoleBinding from the 'default' serviceAccount and try again, it works fine.

-Nick

nnachefski avatar Nov 22 '22 23:11 nnachefski

Which version of PGO and OpenShift are you using?

cbandy avatar Nov 28 '22 17:11 cbandy

We are experiencing the same on several of our OKD clusters (OKD 4.9, 4.10, 4.11) with PGO 5.x.x. But we can narrow the scope of the issue: it only happens when you run multiple services which are using the "default" namespace. For some reason, if PGO is installed in an otherwise empty namespace, the default serviceAccount works.

The (manual) solution for running it alongside other services in the same namespace is to set the serviceAccount in the "repo-host" StatefulSet to the one generated by PGO.

dan1el-k avatar Dec 22 '22 12:12 dan1el-k

Hi!

I'm experiencing the same issue. The postgresCluster CR has no property for setting the serviceAccount for pgbackrest, so I have to assign SCCs to the default serviceAccount. Running OKD 4.11, operator 5.3.0.

joyartoun avatar Feb 03 '23 07:02 joyartoun

This issue is still happening on OKD 4.12 with CrunchyData 5.3.0

The problem manifests itself in the pgbackrest-log-dir initContainer.

Here is the workaround for now (change the sts and serviceAccount names to whatever yours are called):

oc patch sts airflow-repo-host --type=merge -p '{"spec":{"template":{"spec":{"initContainers":[{"name":"pgbackrest-log-dir"}],"serviceAccountName":"airflow-pgbackrest"}}}}'
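
As a rough sketch of a narrower variant of the workaround above, the patch body could carry only the serviceAccountName field and be applied with oc's default patch type. This is an assumption and has not been tested against PGO; build_sa_patch is a hypothetical helper, not part of oc or PGO, and the names are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc or PGO): print a patch body that sets
# only serviceAccountName on a StatefulSet's pod template.
build_sa_patch() {
  # $1: name of the serviceAccount to use, e.g. airflow-pgbackrest
  printf '{"spec":{"template":{"spec":{"serviceAccountName":"%s"}}}}\n' "$1"
}

# Usage (requires a logged-in oc session; adjust the names to your cluster):
#   oc patch sts airflow-repo-host -p "$(build_sa_patch airflow-pgbackrest)"
#
# Check which serviceAccount the StatefulSet template now references:
#   oc get sts airflow-repo-host -o jsonpath='{.spec.template.spec.serviceAccountName}'
```

Building the body in a function keeps the quoting in one place, so the same command can be reused for other clusters by changing only the two names.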

nnachefski avatar May 12 '23 06:05 nnachefski

Thank you @nnachefski, it helped me a lot. The pod works and backups are fine, but it cannot write logs, because the OpenShift uid doesn't have write access to the log dir:

sh-4.4$ ls -la /pgbackrest/repo1/log/
total 0
drwxr-xr-x. 2 26 26 0 Jun 5 10:06 .

I think the uid of the postgres user is 26 in the image, but we use the OpenShift uid here.

I just wanted to highlight this in case someone else finds this issue and the workaround. I hope Crunchy will fix this soon.

douggutaby avatar Jun 07 '23 09:06 douggutaby

I have a question related to the use of this service account. The repo-host pod is using the default service account in my case and I am getting the error:

option 'repo1-s3-key-type' is 'web-id' but 'AWS_ROLE_ARN' and 'AWS_WEB_IDENTITY_TOKEN_FILE' are not set

I believe it has to do with the pod using the default service account, while the other pod is using the helix-instance service account (my Postgres cluster is called helix), which does have AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE set. Is this pod meant to be using the default service account, or the helix-instance service account like the other pod? I am using AWS EKS and trying to back up to AWS S3. Please help.

loydbanks avatar Apr 24 '24 19:04 loydbanks