Fix k8s NAT manager logic and add `--Xnat-kube-service-namespace` flag
PR description
Currently, the Kubernetes NAT manager requires cluster-wide permissions (listing Service resources in all namespaces). It also has a bug: when multiple Besu networks are configured in the same cluster with the same service names in different namespaces, the NAT manager may set an inappropriate IP address for the Besu node, because it enumerates Services from all namespaces and matches only on their names.
I suggest adding a new flag, `--Xnat-kube-service-namespace`, to specify the concrete namespace of the Besu node's Service, and replacing the cluster-wide list with a get request for that specific Service.
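A minimal sketch of the difference, using the Fabric8 Kubernetes client purely for illustration (Besu's NAT manager may use a different client library, and the constants here are hypothetical):

```java
import io.fabric8.kubernetes.api.model.Service;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class ServiceLookupSketch {

  // Hypothetical values; in Besu they would come from the CLI options.
  private static final String SERVICE_NAME = "besu-node"; // --Xnat-kube-service-name
  private static final String SERVICE_NAMESPACE = "besu"; // --Xnat-kube-service-namespace (new)

  public static void main(final String[] args) {
    try (KubernetesClient client = new KubernetesClientBuilder().build()) {
      // Current behaviour: list Services in ALL namespaces and match on name only.
      // This needs a cluster-wide "list" permission and is ambiguous when the same
      // service name exists in several namespaces.
      final Service byList =
          client.services().inAnyNamespace().list().getItems().stream()
              .filter(s -> SERVICE_NAME.equals(s.getMetadata().getName()))
              .findFirst()
              .orElseThrow();

      // Proposed behaviour: GET one Service in one namespace.
      // This needs only a namespaced "get" permission and is unambiguous.
      final Service byGet =
          client.services().inNamespace(SERVICE_NAMESPACE).withName(SERVICE_NAME).get();

      System.out.println("list match: " + byList.getMetadata().getNamespace());
      System.out.println("get match:  " + byGet.getMetadata().getNamespace());
    }
  }
}
```

With the get-based lookup, the RBAC `Role` only needs the `get` verb on the one Service, scoped to its namespace, instead of a cluster-wide `list`.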
Fixed Issue(s)
fixes #5002
- [ ] I thought about documentation and added the `doc-change-required` label to this PR if updates are required.
- [ ] I thought about the changelog and included a changelog update if required.
- [ ] If my PR includes database changes (e.g. KeyValueSegmentIdentifier), I have thought about compatibility and performed forwards and backwards compatibility tests.
@alex123012 there are a couple of unit tests failing. You should be able to reproduce them locally.
> Task :besu:test
BesuCommandTest > natManagerServiceNameCannotBeUsedWithNatDockerMethod FAILED
    java.lang.AssertionError at BesuCommandTest.java:1971
BesuCommandTest > natManagerServiceNameCannotBeUsedWithNatNoneMethod FAILED
    java.lang.AssertionError at BesuCommandTest.java:1981
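For context, a hypothetical sketch of the analogous validation test for the new flag, modeled on the existing `--Xnat-kube-service-name` tests (the base class, helper names, and assertion text are assumptions, not the actual Besu test code):

```java
import static java.nio.charset.StandardCharsets.UTF_8;
import static org.assertj.core.api.Assertions.assertThat;

import org.junit.Test;

// Hypothetical sketch; CommandTestAbstract, parseCommand, commandOutput and
// commandErrorOutput stand in for Besu's real test fixtures.
public class NatKubeServiceNamespaceValidationTestSketch extends CommandTestAbstract {

  @Test
  public void natKubeServiceNamespaceCannotBeUsedWithNatDockerMethod() {
    // Combine the k8s-only namespace flag with a non-KUBERNETES NAT method.
    parseCommand("--nat-method", "DOCKER", "--Xnat-kube-service-namespace", "besu");

    // Expect no normal output and a validation error on stderr naming the flag.
    assertThat(commandOutput.toString(UTF_8)).isEmpty();
    assertThat(commandErrorOutput.toString(UTF_8))
        .contains("--Xnat-kube-service-namespace");
  }
}
```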
Code looks OK. @alex123012, how would I verify that it's working? I'm new to Kubernetes.
You could try kind/minikube with the https://github.com/Consensys/quorum-kubernetes repo and a changed image. I've tested it with IBFT2.
Something like a plan:
- Install minikube and kubectl.
- Build the Besu image and load it into the minikube cluster (for example, with the cache option or another option from the docs).
- Change the Besu image in the k8s manifests to the image you've built (for example, here; you'll need to change it in all the other files in this dir, as in the example).
- The namespace in code is set to `besu` by default, so you can omit the new flag, but you can explicitly set it here and in all the other files in this dir (see the option sketch after this list).
- Go through the quorum-kubernetes docs (the minikube cluster is created in them).
- If all is good, check the Besu logs for entries about the K8S manager; it should have no errors.
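For reference on that default, a hypothetical picocli option definition along the lines of what this PR adds (the field name, label, and description wording are assumptions, not the exact code from the PR):

```java
import picocli.CommandLine;

public class NatOptionsSketch {

  // Hidden, following the convention for experimental "--X" flags.
  @CommandLine.Option(
      names = {"--Xnat-kube-service-namespace"},
      hidden = true,
      paramLabel = "<NAMESPACE>",
      description =
          "Namespace of the Besu node's Service when using the KUBERNETES NAT method"
              + " (default: ${DEFAULT-VALUE})")
  private String natKubeServiceNamespace = "besu";
}
```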
Note: to check that everything is OK under more restrictive rules, delete the `"list"` verb from the `Role` manifest, and do the same in the other files in this dir.
If you encounter any problems, you can write to me on Telegram or by email ([email protected]).
@macfarla I wrote a few notes above ^
@macfarla Hi! Any updates here? :)
@cdivitotawela any input on this PR?
Let me check whether I can run this on minikube and verify.
Testing was successful with Kubernetes in AWS on Besu v24.5.2. There was one issue I ran into: I had to name the port 'discovery' or it would throw an error. I would have liked to be able to name it 'discovery-tcp' or 'p2p-tcp'.
Just wondering, after talking to some others about K8S behaviour: does the namespace need to be specified manually via a config option, or can the K8S namespace of the Besu pod just be queried and used when discovering the Service endpoint? A K8S Service can't serve a pod in a different namespace anyway, so it's not clear why the new `--Xnat-kube-service-namespace` flag is needed instead of auto-discovering the pod's namespace.
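For what it's worth, a minimal sketch of that auto-discovery idea: each pod's own namespace is mounted through the standard service-account volume, so it could be read at startup instead of requiring a flag (the helper and its fallback here are hypothetical):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class PodNamespaceSketch {

  // Standard in-cluster mount point for the pod's own namespace.
  private static final Path NAMESPACE_FILE =
      Path.of("/var/run/secrets/kubernetes.io/serviceaccount/namespace");

  // Hypothetical helper: returns the pod's namespace, or the given fallback
  // (e.g. the current hard-coded "besu" default) when not running in a cluster.
  static String currentNamespaceOrDefault(final String fallback) {
    try {
      return Files.readString(NAMESPACE_FILE).trim();
    } catch (final IOException e) {
      return fallback;
    }
  }
}
```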
@alex123012 to simplify the Besu codebase we are planning to deprecate the k8s NAT method, so we are reluctant to make changes to it at this stage.
Suggestion from our DevOps team: I wound up just keeping it simple and using an external service to return the IP (https://github.com/Consensys/ethereum-helm-charts/blob/main/charts/ethereum-helm-charts/charts/elc/templates/besu/statefulset.yml#L66). We bind directly to the host node's IP, and the steps to get things set up on the infra side are here: https://besu.hyperledger.org/public-networks/tutorials/kubernetes
This still keeps the RBAC permissions intact and doesn't require admin rights, which I prefer. Control goes back to the owner, and they decide how/what is allowed.
@alex123012 are you still engaged on this PR? It's marked as stale and will be closed in 2 weeks if there's no activity.
Hi @alex123012
We have made the decision to deprecate k8s with the NAT manager, so we don't want to add to this section of the codebase. We've got solutions in place that bypass the NAT manager for k8s, and they are working for the private and public nodes that we run. Our preference is to keep it simple and clean for the long term, not to add more band-aid patches for another service mechanism or cloud provider.
We recognize that you have gone through a lot of trial and error to create this PR, but that's kinda on us for leaving this as-is and not making the call earlier. It's also a symptom of the fact that we Besu developers aren't experts in this part of the codebase; it's not our core expertise.
If you want to keep and extend this functionality, you are welcome to fork and maintain it, or consider converting it to a plugin long-term.
I'm closing this PR; I hope you understand the maintainers' point of view here.