
Recreating a ClusterTunnel does not cause the tunnelbinding to reattach

cyclingwithelephants opened this issue 9 months ago • 18 comments

If I delete and re-create a ClusterTunnel (for example to update the image used, because the controller does not respect updates to the ClusterTunnel), I am also forced to re-create the TunnelBindings, since they think they are still attached to the old ClusterTunnel.

cyclingwithelephants avatar Apr 28 '25 21:04 cyclingwithelephants

Interesting. Yeah, there is a way for resources to prevent deletion of their parent with ownerRefs (if I understood https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/ correctly). That would partially solve this problem by letting you only update the CRD instead of recreating it; the rest can be solved by the issue you linked.

~Edit: I'm concerned about the side effects of recreating the cluster tunnel without deleting the tunnel binding, since its finalizers cannot be run. If the cluster tunnel is changed to point to a new domain/tunnel, the deletion finalizers would fail and leave things in an unexpected state. Would it be better to prevent its deletion/modification and recommend creating a new cluster tunnel instead?~

adyanth avatar Apr 29 '25 03:04 adyanth

Okay, I have an idea for this. Instead of each tunnel binding triggering a ConfigMap rebuild (which looks at all other tunnel bindings to recreate the ConfigMap), we can create one empty tunnel binding per Tunnel/ClusterTunnel, owned by that Tunnel/ClusterTunnel. This means we can let it "resync" any time the owner changes and repopulate the configs.

Let me know if that makes sense.

Edit: TunnelBindings are namespace scoped, but tunnels can be namespace or cluster scoped. So, if this is on a cluster-scoped tunnel, we would need to fall back to the operator namespace.

adyanth avatar Apr 29 '25 06:04 adyanth

Trying that solution in #140

adyanth avatar Apr 29 '25 09:04 adyanth

If we merge https://github.com/adyanth/cloudflare-operator/pull/136#pullrequestreview-2801873426 then I think a finalizer is enough, because the reasons for deleting a tunnel/cluster tunnel in regular use go away. I'm a little hesitant about creating dummy resources.

At that point, it becomes about managing the cascading delete, we need to decide which way we want to go:

  • Deleting the tunnel/clustertunnel deletes the tunnelbindings (i.e. we set ownerReferences.blockOwnerDeletion=false)
  • The existence of tunnelbindings prevents deletion of the tunnel/clustertunnel (i.e. we set ownerReferences.blockOwnerDeletion=true)
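Either option would be expressed through an `ownerReferences` entry that the operator sets on each TunnelBinding. A sketch of what that metadata might look like (the `apiVersion` group and `uid` here are illustrative; the UID must match the live owner object, and note that `blockOwnerDeletion` only takes effect when foreground cascading deletion is used):

```yaml
apiVersion: networking.cfargotunnel.com/v1alpha1
kind: TunnelBinding
metadata:
  name: my-binding
  namespace: default
  ownerReferences:
    - apiVersion: networking.cfargotunnel.com/v1alpha1
      kind: ClusterTunnel
      name: my-tunnel
      uid: 9a1b2c3d-0000-0000-0000-000000000000   # hypothetical; must be the owner's real UID
      controller: false
      blockOwnerDeletion: true   # the second option: bindings block owner deletion (foreground deletion only)
```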

I can see arguments either way and I'm not sure I have an opinion on that at this time

cyclingwithelephants avatar Apr 29 '25 10:04 cyclingwithelephants

Drawing inspiration from role/rolebindings, they just work, but that could be due to the fact that they are checked when needed, unlike our tunnel binding. I'm a little hesitant in deleting the tunnel bindings on a tunnel deletion because it would cause some amount of churn with DNS records being deleted. Preventing deletion of a tunnel is reasonable, but would need to be enforced by the operator, k8s would not do it afaict.

The dummy empty tunnel binding works surprisingly well in my limited testing, so if you want to give it a shot and see if that is reasonable, that might work as well. Its lifecycle is tied to the tunnel, and the act of creating it updates the ConfigMap by rediscovering all orphaned tunnel bindings on a recreate.

adyanth avatar Apr 29 '25 12:04 adyanth

I did some research, and it looks like setting ownership isn't complicated and makes this problem fairly easy to solve. (It's an AI chat, so take it with a grain of salt.) How does this approach look to you?

I think what makes me nervous about this empty tunnel binding is that it becomes its own class of object we have to treat specially; if we can get the ownership working relatively simply, it feels a bit more canonical.

Having said all this, we might be able to ignore this for now. The reason I was needing to manage this was because cloudflared images couldn't be updated without re-creating the tunnel/clustertunnel

cyclingwithelephants avatar Apr 29 '25 13:04 cyclingwithelephants

Owns does not work because we would end up with multiple owners (each tunnel binding trying to own the tunnel). Watches is what I tried initially, but I saw two issues (more bad practices than blockers).

First, each tunnel binding would reconcile itself when the tunnel is recreated, which is unnecessary, since one tunnel binding's reconcile already handles all existing tunnel bindings.

Second, the handler.TypedEventHandler used by Watches is marked experimental here and here.

But as you say, we might be able to not do this for now.

adyanth avatar Apr 30 '25 02:04 adyanth

Ah, I was thinking that the TunnelBinding would be owned by the Tunnel/ClusterTunnel. The Gateway API does something similar with parentRef.

First one being each tunnel binding would reconcile itself on the tunnel being recreated, which is not necessary, since one tunnel binding reconciles all existing tunnel bindings.

How does this work? Is each tunnelBinding not an independent patch operation on the config file?

To be clear, long term I think we should find a way to make this work, but the priority is much lower imo in light of https://github.com/adyanth/cloudflare-operator/pull/136, and it sounds like you agree

cyclingwithelephants avatar Apr 30 '25 10:04 cyclingwithelephants

I'll look into how the gateway api controllers work some time.

How does this work? Is each tunnelBinding not an independent patch operation on the config file?

Each tunnel binding's reconciliation looks at all other tunnel bindings for that tunnel and regenerates the config file. (That's why a new dummy tunnel binding works to reconcile them all.) Patching the ConfigMap config's list in place seemed brittle, especially since there could be multiple reconciliations scheduled. Open to improvement if you have any ideas.

adyanth avatar Apr 30 '25 10:04 adyanth

I'll look into how the gateway api controllers work some time

I have the most relevant example you could possibly look at haha

I'm writing this as I've just experienced a total outage after a sqlite -> etcd migration went wrong in my k3s cluster, and I had a fun time getting the routing to work. It would be really nice if this just worked™

I wonder if the way to do this is to make each TunnelBinding work independently of other TunnelBindings, and make patching deterministic. My first example doesn't handle deletes IIRC, but that shouldn't be too hard.

cyclingwithelephants avatar May 04 '25 13:05 cyclingwithelephants

How do you handle conflicts from multiple tunnel bindings trying to update the configs simultaneously? Would it be just a retry loop?

Maybe an ORM style mapping from the config file's list to a map may be a nice QOL improvement and help with upserts and deletes.
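A minimal sketch of that list-to-map idea, assuming hostnames are the natural key (types and names here are illustrative, not the operator's actual code): key the ingress list by hostname so one binding's reconcile can upsert or delete just its own entries, then flatten back to a sorted list for serialization.

```go
package main

import (
	"fmt"
	"sort"
)

// Rule is a hypothetical stand-in for one entry in cloudflared's ingress list.
type Rule struct {
	Hostname string
	Service  string
}

// toMap keys the config's ingress list by hostname, making upserts and
// deletes O(1) map operations instead of list surgery.
func toMap(rules []Rule) map[string]Rule {
	m := make(map[string]Rule, len(rules))
	for _, r := range rules {
		m[r.Hostname] = r
	}
	return m
}

// toList flattens the map back into a hostname-sorted list so the serialized
// config stays deterministic across reconciles.
func toList(m map[string]Rule) []Rule {
	out := make([]Rule, 0, len(m))
	for _, r := range m {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Hostname < out[j].Hostname })
	return out
}

func main() {
	m := toMap([]Rule{{"a.example.com", "http://svc-a:80"}})
	m["b.example.com"] = Rule{"b.example.com", "http://svc-b:80"} // upsert
	delete(m, "a.example.com")                                    // delete
	fmt.Println(toList(m))
}
```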

adyanth avatar May 06 '25 06:05 adyanth

etcd handles concurrent writes pretty well, and the reconcile loop is eventually consistent, so even if there were a concurrency problem it would get sorted out by itself.

If TunnelBindings conflict in terms of routing rules, that would need to be validated, I suppose, with a validating admission webhook, the same way it's done for Ingress resources (so conflicts are caught at TunnelBinding deploy time and the controller doesn't need to worry about them).
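The conflict check such a webhook would run could be as simple as the sketch below (all names are hypothetical; a real webhook would wrap this in an admission handler and build the `existing` index from the live TunnelBindings):

```go
package main

import "fmt"

// findConflicts returns the hostnames in a candidate binding's spec that are
// already claimed by a different binding. A validating admission webhook
// could call this and reject the candidate when the result is non-empty.
func findConflicts(existing map[string]string, candidate string, hostnames []string) []string {
	var conflicts []string
	for _, h := range hostnames {
		if owner, ok := existing[h]; ok && owner != candidate {
			conflicts = append(conflicts, h)
		}
	}
	return conflicts
}

func main() {
	// Index of hostname -> owning binding, built from existing TunnelBindings.
	existing := map[string]string{"app.example.com": "binding-a"}
	fmt.Println(findConflicts(existing, "binding-b", []string{"app.example.com", "new.example.com"}))
}
```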

cyclingwithelephants avatar May 06 '25 14:05 cyclingwithelephants

By concurrency, I mean when the k8s API rejects the patch because the state changed underneath.
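For that case, client-go's usual pattern is a read-mutate-write loop that re-fetches the object whenever the versioned write is rejected (`retry.RetryOnConflict` does this for real clients). A self-contained sketch of the shape of that loop, with a fake versioned store standing in for the API server:

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("conflict: resourceVersion changed")

// updateWithRetry re-reads the latest object and reapplies the mutation each
// time the versioned write is rejected, up to maxRetries attempts. This
// mirrors the optimistic-concurrency pattern the Kubernetes API uses.
func updateWithRetry(maxRetries int, read func() (string, int), write func(string, int) error, mutate func(string) string) error {
	var err error
	for i := 0; i < maxRetries; i++ {
		obj, version := read()            // fetch latest state + resourceVersion
		err = write(mutate(obj), version) // attempt a versioned write
		if !errors.Is(err, errConflict) {
			return err // success, or a non-conflict error we don't retry
		}
	}
	return err
}

func main() {
	// Fake store: a value plus a monotonically increasing version.
	store, version := "config-v1", 1
	read := func() (string, int) { return store, version }
	write := func(v string, ver int) error {
		if ver != version {
			return errConflict // stale resourceVersion: reject the write
		}
		store, version = v, version+1
		return nil
	}

	err := updateWithRetry(3, read, write, func(s string) string { return s + "+patch" })
	fmt.Println(store, err)
}
```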

Interesting approach with the validating web hook for detecting conflicts!

adyanth avatar May 06 '25 14:05 adyanth

Ah, if it rejects the patch presumably that would be handled by the eventually consistent nature of this patching? Or am I missing something?

cyclingwithelephants avatar May 06 '25 14:05 cyclingwithelephants

Yeah, it depends on whether we requeue or handle it with a retry in the same reconcile call. I'm hesitant because the failed patch, if simply reapplied in a retry, would probably be incorrect, so we would need to requeue. My concern is that applying a couple dozen tunnel bindings would generate a multiplying number of reconciliation requests as the conflicts grow.

adyanth avatar May 06 '25 14:05 adyanth

I think we can potentially use exponential backoff with jitter to solve this, thoughts?

cyclingwithelephants avatar May 06 '25 18:05 cyclingwithelephants

I don't think a requeue can know that it is a requeue?

Even with that, exponential backoff quickly goes into minutes given enough things reconciling, making the end-user experience worse as they have to wait so long after applying.

adyanth avatar May 07 '25 00:05 adyanth

I don't think a requeue can know that it is a requeue?

Oh true.

I think jitter might actually be enough, right? Requeue for something between 1 and 10 (or more) seconds, and it should smooth out the reconciles, if indeed there is any conflict in the first place.

cyclingwithelephants avatar May 07 '25 20:05 cyclingwithelephants