
Installing a Redpanda cluster with the same name as the Operator breaks the Operator

Open voutilad opened this issue 1 year ago • 6 comments

Found this using Terraform, but that seems unrelated.

Install an Operator instance using Helm with the name "redpanda" in a namespace of your choosing.

Create a Redpanda CR with the name "redpanda" in the same namespace and apply it.

The Operator will trash its own Helm release, removing critical resources like Roles.

JIRA Link: K8S-207

voutilad avatar Apr 01 '24 14:04 voutilad

I've confirmed that things still break without Terraform and the helm_release resource from the helm provider, though in that case clusters just never deploy. Once the Operator is installed with the name redpanda, it's unable to deploy any clusters, regardless of the name of the Redpanda CR.

voutilad avatar Apr 01 '24 16:04 voutilad

This is probably a naming collision of objects on release. There may be some things we can do to resolve this, like using the helm.sh/resource-policy: keep annotation. However, I do not recommend this.

Another approach is to prefix the objects that the operator needs with something like {releasename}-operator-role.yaml, but you will get funny names like operator-operator.yaml.
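
To make the prefixing idea concrete, here is a minimal sketch. The helper name operator_resource_name and the exact name layout are assumptions for illustration only; the real chart templates may build names differently.

```python
def operator_resource_name(release_name: str, kind: str) -> str:
    """Build an operator-owned object name from the Helm release name.

    Hypothetical scheme: '<release>-operator-<kind>'. It keeps operator
    objects apart from objects created for a cluster release that happens
    to share the same release name.
    """
    return f"{release_name}-operator-{kind}"


# A release called "redpanda" yields a name that a cluster release
# named "redpanda" would not reuse.
print(operator_resource_name("redpanda", "role"))

# The downside mentioned above: a release called "operator" stutters.
print(operator_resource_name("operator", "role"))
```

The second call shows where the "operator-operator" names come from: the suffix is added unconditionally, whatever the release is called.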

Ultimately if this is happening maybe we should just declare this a known issue and simply not allow an operator release name of "redpanda" in general.

alejandroEsc avatar Apr 02 '24 00:04 alejandroEsc

We could have the operator check whether there's a Helm-managed release with a conflicting name by looking for the secrets that Helm uses for managing releases before creating the HelmRelease CR. Kinda weird that Flux doesn't do something like that, but it's not worth digging into the internals of Flux.
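
A rough sketch of what that check could look like, assuming Helm 3's default Secret storage, where each release revision is stored in a Secret named sh.helm.release.v1.<release>.v<revision>. The function names are hypothetical, and the list of Secret names is assumed to come from listing Secrets in the cluster's namespace; this is not how the operator currently behaves.

```python
import re

# Helm 3 (default Secret storage driver) names release Secrets
# "sh.helm.release.v1.<release>.v<revision>".
HELM_SECRET_RE = re.compile(
    r"^sh\.helm\.release\.v1\.(?P<release>.+)\.v(?P<revision>\d+)$"
)


def helm_release_names(secret_names):
    """Return the set of Helm release names found among Secret names."""
    releases = set()
    for name in secret_names:
        match = HELM_SECRET_RE.match(name)
        if match:
            releases.add(match.group("release"))
    return releases


def conflicts_with_helm_release(cr_name, secret_names):
    """True if a Helm release named like the Redpanda CR already exists.

    Hypothetical helper: secret_names is assumed to be the names of all
    Secrets in the namespace the CR is being created in.
    """
    return cr_name in helm_release_names(secret_names)


secrets = ["sh.helm.release.v1.redpanda.v1", "some-other-secret"]
print(conflicts_with_helm_release("redpanda", secrets))
print(conflicts_with_helm_release("my-cluster", secrets))
```

Note that this sketch cannot tell an unrelated release apart from one that was created earlier for the same CR, so a real check would need some way to recognise releases it already owns.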

chrisseto avatar Apr 02 '24 19:04 chrisseto

@voutilad what did you expect to happen OOC? I would expect something to go wrong but I write the software. If you expected that the operator would "take over" the existing release, we might want to bump this up in priority as I expect most users would share your point of view and not ours.

If memory serves there's a doc that talks about migrating from helm to the operator. Does it recommend doing this?

chrisseto avatar Apr 02 '24 19:04 chrisseto

I’d have expected it to fail to deploy the Redpanda cluster and keep the Operator in a functioning state or generate a release name that doesn’t conflict and allows me to name my Redpanda CR how I want.

The way I view this is: how am I (the user) supposed to know I’m actually causing Helm release name collisions? From my point of view, I’m giving a name to a Redpanda custom resource.

I understand how things work underneath, but that’s only because I’ve used the Operator enough and read all the docs.

I can also envision a scenario in which someone doesn’t actually know the Helm release name for the Operator install.

voutilad avatar Apr 03 '24 10:04 voutilad

Makes sense! I propose that we make the operator fail to deploy the cluster if there's an existing helm release with a conflicting name as that's the most straightforward and least fragile solution. We'll still need to double check the migration docs before starting work on this ticket.

chrisseto avatar Apr 10 '24 14:04 chrisseto