
Bundle openshift-gitops-operator fails on RBAC trying to set an ownerRef on the CRD

Open itroyano opened this issue 1 year ago • 8 comments

Version: OpenShift 4.16 tech preview (enabled by feature gate)

$ oc -n openshift-gitops get bundledeployments.core.rukpak.io  openshift-gitops-operator -o yaml
apiVersion: core.rukpak.io/v1alpha2
kind: BundleDeployment
metadata:
  creationTimestamp: "2024-08-26T13:03:27Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2024-08-26T13:05:55Z"
  finalizers:
  - core.rukpak.io/delete-cached-bundle
  generation: 2
  name: openshift-gitops-operator
  ownerReferences:
  - apiVersion: olm.operatorframework.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterExtension
    name: openshift-gitops-operator
    uid: 4f4b94de-3197-40cc-b564-188067006883
  resourceVersion: "57645"
  uid: 74fa0df1-1cea-4771-9769-631e174cf090
spec:
  installNamespace: openshift-gitops
  provisionerClassName: core-rukpak-io-registry
  source:
    image:
      ref: registry.redhat.io/openshift-gitops-1/gitops-operator-bundle@sha256:a782f27b301fd2c06e94125cca735590a21d87cbe18cf15f06197679462bb65d
    type: image
status:
  conditions:
  - lastTransitionTime: "2024-08-26T13:03:31Z"
    message: Successfully unpacked the image Bundle
    reason: UnpackSuccessful
    status: "True"
    type: Unpacked
  - lastTransitionTime: "2024-08-26T13:03:34Z"
    message: 'cannot patch "argocds.argoproj.io" with kind CustomResourceDefinition:
      CustomResourceDefinition.apiextensions.k8s.io "argocds.argoproj.io" is invalid:
      metadata.ownerReferences: Invalid value: []v1.OwnerReference{v1.OwnerReference{APIVersion:"core.rukpak.io/v1alpha2",
      Kind:"BundleDeployment", Name:"openshift-gitops-operator", UID:"74fa0df1-1cea-4771-9769-631e174cf090",
      Controller:(*bool)(0xc01cd9c788), BlockOwnerDeletion:(*bool)(0xc01cd9c789)},
      v1.OwnerReference{APIVersion:"core.rukpak.io/v1alpha2", Kind:"BundleDeployment",
      Name:"openshift-gitops-operator", UID:"cd53d7b4-f7d3-4cca-b78f-6b128d7b4b27",
      Controller:(*bool)(0xc01cd9c78a), BlockOwnerDeletion:(*bool)(0xc01cd9c78b)}}:
      Only one reference can have Controller set to true. Found "true" in references
      for BundleDeployment/openshift-gitops-operator and BundleDeployment/openshift-gitops-operator'
    reason: InstallFailed
    status: "False"
    type: Installed
  - lastTransitionTime: "2024-08-26T13:03:34Z"
    message: Installed condition is false
    reason: InstallationStatusFalse
    status: "False"
    type: Healthy
  contentURL: https://core.openshift-rukpak.svc/bundles/openshift-gitops-operator.tgz
  observedGeneration: 2
  resolvedSource:
    image:
      ref: registry.redhat.io/openshift-gitops-1/gitops-operator-bundle@sha256:a782f27b301fd2c06e94125cca735590a21d87cbe18cf15f06197679462bb65d
    type: image

itroyano avatar Aug 26 '24 13:08 itroyano

@itroyano based on the error message in the Installed status condition, I suspect that argocd itself has already been installed and the CRDs are also being managed by another controller.

It also looks like the version of OLM 1.0 shipped with the 4.16 TP uses RukPak and is a bit outdated compared to the current state of this project.

everettraven avatar Aug 26 '24 13:08 everettraven

Regardless, it seems the crux of this issue is that multiple instances of extensions are attempting to create and manage the same resources, which will not be supported.

everettraven avatar Aug 26 '24 13:08 everettraven

Makes sense. The duplication sounds weird, though:

Only one reference can have Controller set to true. Found "true" in references for BundleDeployment/openshift-gitops-operator and BundleDeployment/openshift-gitops-operator

itroyano avatar Aug 26 '24 13:08 itroyano
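
For readers following along, the rule in the quoted message can be pictured with a small check. This is an illustrative Python approximation of that validation, not the API server's actual code. The helper names `controller_owners` and `validate_owner_refs` are hypothetical, and the dictionaries mirror the two references printed in the error message, with `controller` assumed true for both because the message reports it.

```python
def controller_owners(owner_refs):
    """Return the owner references that have controller set to true."""
    return [ref for ref in owner_refs if ref.get("controller") is True]


def validate_owner_refs(owner_refs):
    """Raise ValueError if more than one reference is marked as controller."""
    controllers = controller_owners(owner_refs)
    if len(controllers) > 1:
        names = " and ".join(f'{r["kind"]}/{r["name"]}' for r in controllers)
        raise ValueError(
            "Only one reference can have Controller set to true. "
            f'Found "true" in references for {names}'
        )


# The two references from the error message: same kind and name, different UIDs.
refs = [
    {
        "apiVersion": "core.rukpak.io/v1alpha2",
        "kind": "BundleDeployment",
        "name": "openshift-gitops-operator",
        "uid": "74fa0df1-1cea-4771-9769-631e174cf090",
        "controller": True,  # assumed from the error text
        "blockOwnerDeletion": True,
    },
    {
        "apiVersion": "core.rukpak.io/v1alpha2",
        "kind": "BundleDeployment",
        "name": "openshift-gitops-operator",
        "uid": "cd53d7b4-f7d3-4cca-b78f-6b128d7b4b27",
        "controller": True,  # assumed from the error text
        "blockOwnerDeletion": True,
    },
]

try:
    validate_owner_refs(refs)
except ValueError as err:
    print(err)
```

Because both references share a kind and name and differ only in UID, the message names the same object twice, which is why the duplication reads oddly.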

Yeah, I think long-term we need to make sure that we have a more descriptive error message for this scenario

everettraven avatar Aug 26 '24 13:08 everettraven

That one seems to be thrown by RBAC, not us, right? We could check for a duplication earlier as part of the multiple-instances epic https://github.com/operator-framework/operator-controller/issues/736

itroyano avatar Aug 26 '24 13:08 itroyano

@itroyano, this sounds like an issue from helm-operator-plugins, which adds owner references to managed objects. I noticed similar duplication when I was implementing and testing the chunked release secret driver, and I think it should be fixed in both helm-operator-plugins and operator-controller main branches.

joelanford avatar Aug 26 '24 14:08 joelanford

Can we prevent the installation of an operator with v1 if we detect that v0 already has it installed?

itroyano avatar Aug 26 '24 14:08 itroyano

I think that is jumping to the conclusion that this issue is caused by duplicate installations. What I saw (and believe I fixed) was an issue where the ownerref-injecting client had a bug and would inject two different ownerrefs for the same parent object.

And that's what this looks like as well:

Found "true" in references for BundleDeployment/openshift-gitops-operator and BundleDeployment/openshift-gitops-operator

joelanford avatar Aug 26 '24 14:08 joelanford
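
To illustrate the kind of bug described above, here is a minimal sketch of idempotent owner-reference injection, written in Python for brevity. It is not the code from helm-operator-plugins or operator-controller. The function `set_controller_ref` and the choice to match parents on apiVersion, kind, and name are assumptions made for illustration. The idea is that an existing reference to the same parent is replaced rather than appended to, so repeated injection cannot leave two controller references behind.

```python
def _parent_key(ref):
    """Identify a parent by apiVersion, kind, and name (an assumed matching rule)."""
    return (ref["apiVersion"], ref["kind"], ref["name"])


def set_controller_ref(owner_refs, new_ref):
    """Return a new list in which new_ref replaces any reference to the same parent."""
    kept = [r for r in owner_refs if _parent_key(r) != _parent_key(new_ref)]
    return kept + [dict(new_ref, controller=True)]


# An existing reference to the same parent name, carrying a different UID.
existing = [
    {
        "apiVersion": "core.rukpak.io/v1alpha2",
        "kind": "BundleDeployment",
        "name": "openshift-gitops-operator",
        "uid": "cd53d7b4-f7d3-4cca-b78f-6b128d7b4b27",
        "controller": True,
    }
]

current = {
    "apiVersion": "core.rukpak.io/v1alpha2",
    "kind": "BundleDeployment",
    "name": "openshift-gitops-operator",
    "uid": "74fa0df1-1cea-4771-9769-631e174cf090",
}

# Applying the injection twice still yields a single controller reference.
result = set_controller_ref(set_controller_ref(existing, current), current)
print(len(result), result[0]["uid"])
```

Whether replacing the older UID is the right behavior depends on whether that reference is actually stale; the sketch assumes it is.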

Issues go stale after 90 days of inactivity. If there is no further activity, the issue will be closed in another 30 days.

github-actions[bot] avatar Aug 03 '25 01:08 github-actions[bot]

This issue has been closed due to inactivity.

github-actions[bot] avatar Sep 03 '25 01:09 github-actions[bot]