occm: LB: Fall back to TCP health mon w/o HTTP.
If the load balancer does not support an HTTP health monitor, we currently fail to create one. This is currently the case with OVN load balancers, and the knowledge that HTTP health monitors are unsupported is hardcoded into occm. While HTTP is preferable and is on the roadmap for OVN, it is not straightforward and not likely to happen very soon.
TCP health monitoring is supported and still detects the vast majority of cases where the service is unavailable, e.g. situations where a node is unavailable, the listening process has died, or the node is severed from the network. So a TCP health monitor is much better than no health monitor.
This patch implements an unconditional fallback to TCP health monitoring if HTTP health monitoring is not available. This is similar to UDP, where we already use a UDP-CONNECT health monitor. The patch also writes the type of health monitor to the logs.
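To illustrate the shape of the fallback (a sketch only; the function name, signature and log call are invented for this example and are not the actual occm code):

```go
package main

import "fmt"

// chooseMonitorType sketches the fallback described above: prefer an HTTP
// health monitor, but fall back to TCP when the provider (e.g. OVN) does not
// support HTTP monitors. The real occm code logs via klog rather than fmt.
func chooseMonitorType(httpMonitorSupported bool) string {
	monitorType := "HTTP"
	if !httpMonitorSupported {
		// OVN loadbalancers: HTTP health monitors are unsupported, use TCP.
		monitorType = "TCP"
	}
	// The patch also reports the chosen type in the logs.
	fmt.Printf("creating %s health monitor\n", monitorType)
	return monitorType
}

func main() {
	chooseMonitorType(false) // e.g. OVN provider
	chooseMonitorType(true)  // e.g. amphora provider
}
```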
Tested with OpenStack Zed using OVS 3.1.1 and OVN 23.03 (SCS R4), k8s v1.26 and v1.27, and occm v1.26.2 and v1.27. Test scenario: nginx-ingress with externalTrafficPolicy: Local.
What this PR does / why we need it: Without this patch, OVN-provider Octavia load balancers cannot have health monitors for TCP listeners, because OCCM wants to create an HTTP health monitor, which OVN LBs don't currently support. With it, we fall back to TCP health monitors, which still cover the vast majority of situations with a non-working backend member service and are certainly much better than no health monitor. This makes OVN LBs a really attractive choice in front of a service with externalTrafficPolicy: Local, as the OVN LB retains the original client IP (unlike the amphorae, where you need workarounds with the proxy protocol to get that information).
Which issue this PR fixes (if applicable): fixes #2225
Special notes for reviewers:
Containers to test are on: registry.scs.community/occm-kg/garloff/openstack-cloud-controller-manager:v1.27.103 registry.scs.community/occm-kg/garloff/openstack-cloud-controller-manager:v1.26.203
I would like to see this patch applied not just to master, but also to the v1.27, v1.26 and v1.25 stable branches.
Release note:
[occm] Support health monitors for OVN loadbalancers by falling back from HTTP to TCP.
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: garloff / name: Kurt Garloff (37eba314b1a8fad207364fa46f3acbcb999cc96a)
Welcome @garloff!
It looks like this is your first PR to kubernetes/cloud-provider-openstack 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.
You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.
You can also check if kubernetes/cloud-provider-openstack has its own contribution guidelines.
You may want to refer to our testing guide if you run into trouble with your tests not passing.
If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!
Thank you, and welcome to Kubernetes. :smiley:
Hi @garloff. Thanks for your PR.
I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This PR addresses issue #2225.
/ok-to-test
/approve
@dulek please comment in case you have concerns, thanks
I'm struggling to find any documentation for the other end of this health check, but this GKE documentation hints that this behaviour might not be correct: https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer-parameters#spd-hc-port
Note: The kube-proxy always responds to packets sent to the health check node port. Health checks for LoadBalancer Services cannot be answered by serving Pods.
A TCP check would presumably always return success in this case.
I think this piece [1] would make the health monitor actually check the nodePort and not the healthCheckNodePort in this case.
[1] https://github.com/openshift/cloud-provider-openstack/blob/a7c2d10b1add321cf1bee5f8a2c7896d51e42a40/pkg/openstack/loadbalancer.go#L1288-L1290
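Paraphrasing what [1] does, as far as I can tell (simplified names, not the literal occm code): the member's monitor port is only set to healthCheckNodePort when an HTTP monitor can be used; otherwise the monitor targets the Service NodePort itself.

```go
package main

import "fmt"

// monitorPort paraphrases the port selection referenced in [1]: only an HTTP
// monitor queries kube-proxy's healthCheckNodePort; a TCP or UDP-CONNECT
// monitor checks the Service NodePort directly. Names are simplified.
func monitorPort(nodePort, healthCheckNodePort int, canUseHTTPMonitor bool) int {
	if healthCheckNodePort > 0 && canUseHTTPMonitor {
		return healthCheckNodePort // kube-proxy healthz endpoint
	}
	return nodePort // the Service NodePort itself
}

func main() {
	fmt.Println(monitorPort(30080, 31234, true))  // HTTP monitor -> 31234
	fmt.Println(monitorPort(30080, 31234, false)) // TCP fallback -> 30080
}
```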
I still don't understand how this would work. The documentation for the TCP health monitor says:
TCP health monitors open a TCP connection to the back-end server’s protocol port. Your custom TCP application should be written to respond OK to the load balancer connecting, opening a TCP connection, and closing it again after the TCP handshake without sending any data.
But we don't have a custom TCP application running on healthCheckNodePort which returns OK, do we?
And we should not need one. Normally, TCP health checking means opening a TCP connection (SYN packet) and considering the check successful if the connection is established (SYN/ACK). https://www.haproxy.com/documentation/aloha/latest/load-balancing/health-checks/tcp/ That is a proper L4 check; sending data or waiting for some specific "OK" would turn it into an L7 check (and we should then use HTTP, of course). So I suspect the OpenStack documentation is misleading here. Will do some testing to be sure ...
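To make the assumption concrete, this is all a pure L4 TCP check needs to do (a standalone sketch; the address and timeout are just examples, and Octavia of course does this inside the LB, not in Go):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// tcpHealthy performs a pure L4 check: the backend is healthy if the TCP
// handshake completes; the connection is closed without sending any data.
func tcpHealthy(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false // no listener, node down, or network severed
	}
	conn.Close()
	return true
}

func main() {
	// Example target: a Service NodePort on some node IP (illustrative values).
	fmt.Println(tcpHealthy("10.0.0.5:30080", 2*time.Second))
}
```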
I pushed an explicit check to ensure we only use a TCP health monitor if we have a TCP LB. (The UDP case was safe before, but let's make it explicit and robust.) Testing: docker pull registry.scs.community/occm-kg/garloff/openstack-cloud-controller-manager@sha256:6c3d70c3a10e0411a6696c13634991ea033f1831f298e724c6d9343f7bc1e3fa
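The added guard is roughly of this shape (again invented names, not the literal diff):

```go
package main

import "fmt"

// fallbackMonitorType sketches the explicit check: only TCP listeners fall
// back to a TCP monitor, UDP listeners keep UDP-CONNECT, and anything else
// gets no fallback monitor at all.
func fallbackMonitorType(listenerProtocol string) (string, error) {
	switch listenerProtocol {
	case "TCP":
		return "TCP", nil
	case "UDP":
		return "UDP-CONNECT", nil
	default:
		return "", fmt.Errorf("no fallback health monitor for protocol %q", listenerProtocol)
	}
}

func main() {
	for _, proto := range []string{"TCP", "UDP", "SCTP"} {
		t, err := fallbackMonitorType(proto)
		fmt.Println(proto, "->", t, err)
	}
}
```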
We have 2 major mismatches here:
We have Octavia documentation suggesting that a TCP check expects an OK response, but nothing will deliver that. The suggestion is that the documentation is wrong. If that's the case I'd prefer to see strong evidence of that, a long-form comment explaining this to the future similarly-confused maintainer, and preferably a linked docs patch against the Octavia docs to fix it.
If the Octavia documentation is not correct and the TCP check is indeed just a SYN/ACK check, then we still have documentation (admittedly for GKE) suggesting that we will always get a response on this port whether a pod is present or not, which would mean this isn't a valid health check.
We need code comments strongly backing up whatever choice we make here. As it stands I can't convince myself that this behaviour is correct. That's not to say it's incorrect, but it's sufficiently in doubt that we need to carefully document the assumptions we've made so they can be confidently challenged by future-us.
We have Octavia documentation suggesting that a TCP check expects an OK response, but nothing will deliver that. The suggestion is that the documentation is wrong. If that's the case I'd prefer to see strong evidence of that, a long-form comment explaining this to the future similarly-confused maintainer, and preferably a linked docs patch against the Octavia docs to fix it.
Sure, makes sense. Anyway, this wording sounds like it's just a connection being created:
TCP health monitors open a TCP connection to the back-end server’s protocol port. Your custom TCP application should be written to respond OK to the load balancer connecting, opening a TCP connection, and closing it again after the TCP handshake without sending any data.
If the Octavia documentation is not correct and the TCP check is indeed just a SYN/ACK check, then we still have documentation (admittedly for GKE) suggesting that we will always get a response on this port whether a pod is present or not, which would mean this isn't a valid health check.
In this case we won't be calling that endpoint at all, as we will not use healthCheckNodePort when lbaas.canUseHTTPMonitor() returns false [1]. The health monitor will call the Service NodePort directly, and with ETP=Local that's a good source of truth about which nodes are hosting the Service pods.
We need code comments strongly backing up whatever choice we make here. As it stands I can't convince myself that this behaviour is correct. That's not to say it's incorrect, but it's sufficiently in doubt that we need to carefully document the assumptions we've made so they can be confidently challenged by future-us.
+1 for more comments! I haven't added enough myself. We should especially add a comment to [1]; a possible shape is sketched after the link below.
[1] https://github.com/kubernetes/cloud-provider-openstack/blob/bd767ba5d4493d2aa7965d57286f1e6460e9dba9/pkg/openstack/loadbalancer.go#L1288-L1290
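One possible shape for such a comment (wording is only a suggestion, to be adapted to the actual code around [1]):

```go
// If the provider cannot do HTTP health monitors (e.g. OVN), we fall back to
// a TCP (or UDP-CONNECT) monitor. That monitor must target the Service
// NodePort, not healthCheckNodePort: kube-proxy accepts TCP connections on
// healthCheckNodePort regardless of pod placement, so a TCP check there would
// always succeed. With externalTrafficPolicy: Local, the NodePort itself only
// accepts connections on nodes that host a Service pod, which makes a plain
// TCP handshake against it a meaningful health signal.
```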
PR needs rebase.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@dulek This is approved but it sounds like we have doubts about it now?
Yes, we feel like the use case described in the PR and the bug should just work in the current state of the code. Waiting for @garloff's input here.
@dulek @jichenjc We should probably remove the approval. If the bot complains that I'm not allowed to do this, it's probably the right command :)
/approve cancel
/remove-approve
/shrug
@garloff: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| openstack-cloud-controller-manager-ovn-e2e-test | 7dfb7b7723f83bbe431e1ca26950975748d0509c | link | true | /test openstack-cloud-controller-manager-ovn-e2e-test |
| openstack-cloud-csi-cinder-e2e-test | 7dfb7b7723f83bbe431e1ca26950975748d0509c | link | true | /test openstack-cloud-csi-cinder-e2e-test |
| openstack-cloud-csi-manila-sanity-test | 7dfb7b7723f83bbe431e1ca26950975748d0509c | link | true | /test openstack-cloud-csi-manila-sanity-test |
| openstack-cloud-csi-manila-e2e-test | 7dfb7b7723f83bbe431e1ca26950975748d0509c | link | true | /test openstack-cloud-csi-manila-e2e-test |
| openstack-cloud-csi-cinder-sanity-test | 7dfb7b7723f83bbe431e1ca26950975748d0509c | link | true | /test openstack-cloud-csi-cinder-sanity-test |
| openstack-cloud-keystone-authentication-authorization-test | 7dfb7b7723f83bbe431e1ca26950975748d0509c | link | true | /test openstack-cloud-keystone-authentication-authorization-test |
| openstack-cloud-controller-manager-e2e-test | 7dfb7b7723f83bbe431e1ca26950975748d0509c | link | true | /test openstack-cloud-controller-manager-e2e-test |
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/approve cancel
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please assign fengyunpan2 for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
@mdbooth done, I removed the approval just now.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
This one is no longer needed. Thanks for all the explanations and doc improvements!