cloud-provider-openstack

occm: LB: Fall back to TCP health mon w/o HTTP.

garloff opened this pull request 2 years ago • 23 comments

If the load balancer does not support an HTTP health-monitor, we currently fail to create one. This is currently the case with OVN load balancers, and the knowledge that HTTP health-mon is unsupported is hardcoded into OCCM. While HTTP is preferable and is on the roadmap for OVN, it is not straightforward and not likely to happen very soon.

TCP health-monitoring is supported and still detects the vast majority of cases with unavailable service, e.g. situations where a node is unavailable, the listening process has died or the node is severed from the network. So TCP health-mon is much better than no health-mon.

This patch implements an unconditional fallback to TCP health monitoring if HTTP health monitoring is not available. This is similar to UDP, where we already use a UDP-CONNECT health-mon. The patch also logs the type of health-mon that is created.
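A minimal sketch of the described fallback, assuming a hypothetical helper name (the actual change in pkg/openstack/loadbalancer.go differs in detail):

```go
package main

import "log"

// chooseMonitorType sketches the fallback described above: prefer an HTTP
// health monitor, but fall back to TCP when the Octavia provider (e.g. OVN)
// does not support HTTP monitors. The function name is hypothetical.
func chooseMonitorType(canUseHTTPMonitor bool) string {
	monitorType := "HTTP"
	if !canUseHTTPMonitor {
		monitorType = "TCP" // e.g. OVN provider without HTTP monitor support
	}
	// The patch also logs which monitor type ends up being created.
	log.Printf("creating %s health monitor", monitorType)
	return monitorType
}

func main() {
	chooseMonitorType(false) // OVN case: falls back to TCP
}
```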

Tested with OpenStack Zed using OvS 3.1.1 and OVN 23.03 (SCS R4), k8s v1.26 and v1.27, and OCCM v1.26.2 and v1.27. The test scenario uses nginx-ingress with externalTrafficPolicy: Local.

What this PR does / why we need it: Without this patch, OVN-provider Octavia load balancers cannot have health-monitors for TCP listeners, as OCCM wants to create an HTTP health-monitor, which OVN LBs don't currently support. With it, we fall back to TCP health-monitors, which still cover the vast majority of situations with a non-working backend member service and are certainly much better than no health-mon. This makes OVN LBs a really attractive choice in front of a service with externalTrafficPolicy: Local, as the OVN LB retains the original client IP (unlike amphorae, where you need proxy-protocol workarounds to get that information).

Which issue this PR fixes (if applicable): fixes #2225

Special notes for reviewers:

Containers to test are on: registry.scs.community/occm-kg/garloff/openstack-cloud-controller-manager:v1.27.103 registry.scs.community/occm-kg/garloff/openstack-cloud-controller-manager:v1.26.203

I would like to see this patch applied not just to master, but also to the v1.27, v1.26 and v1.25 stable branches.

Release note:

[occm] Support health-mon for OVN loadbalancers by falling back from HTTP to TCP.

garloff avatar May 06 '23 05:05 garloff

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: garloff / name: Kurt Garloff (37eba314b1a8fad207364fa46f3acbcb999cc96a)

Welcome @garloff!

It looks like this is your first PR to kubernetes/cloud-provider-openstack 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/cloud-provider-openstack has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar May 06 '23 05:05 k8s-ci-robot

Hi @garloff. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 06 '23 05:05 k8s-ci-robot

This PR addresses issue #2225.

garloff avatar May 06 '23 05:05 garloff

/ok-to-test

jichenjc avatar May 08 '23 01:05 jichenjc

/approve

@dulek please comment in case you have any, thanks

jichenjc avatar May 09 '23 08:05 jichenjc

I'm struggling to find any documentation for the other end of this health check, but this GKE documentation hints that this behaviour might not be correct: https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer-parameters#spd-hc-port

Note: The kube-proxy always responds to packets sent to the health check node port. Health checks for LoadBalancer Services cannot be answered by serving Pods.

A TCP check would presumably always return success in this case.

mdbooth avatar May 10 '23 13:05 mdbooth

I'm struggling to find any documentation for the other end of this health check, but this GKE documentation hints that this behaviour might not be correct: https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer-parameters#spd-hc-port

Note: The kube-proxy always responds to packets sent to the health check node port. Health checks for LoadBalancer Services cannot be answered by serving Pods.

A TCP check would presumably always return success in this case.

I think this piece [1] would make the health monitor actually check the nodePort and not healthCheckNodePort in this case.

[1] https://github.com/openshift/cloud-provider-openstack/blob/a7c2d10b1add321cf1bee5f8a2c7896d51e42a40/pkg/openstack/loadbalancer.go#L1288-L1290
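For illustration, a rough sketch of the port selection being described (hypothetical helper and ports; assumed behaviour based on the linked lines, not the verbatim code behind [1]):

```go
package main

import "fmt"

// monitorPort sketches the port selection referenced in [1]: with an HTTP
// monitor (and ETP=Local providing a healthCheckNodePort) the monitor can
// target that port, which kube-proxy answers; otherwise the monitor targets
// the plain Service NodePort. Assumed behaviour, not the verbatim OCCM code.
func monitorPort(nodePort, healthCheckNodePort int32, canUseHTTPMonitor bool) int32 {
	if canUseHTTPMonitor && healthCheckNodePort != 0 {
		return healthCheckNodePort
	}
	return nodePort
}

func main() {
	// Hypothetical ports: with the TCP fallback, the NodePort is probed.
	fmt.Println(monitorPort(31080, 32123, false)) // prints 31080
}
```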

dulek avatar May 10 '23 14:05 dulek

I still don't understand how this would work. The documentation for the TCP health monitor says:

TCP health monitors open a TCP connection to the back-end server’s protocol port. Your custom TCP application should be written to respond OK to the load balancer connecting, opening a TCP connection, and closing it again after the TCP handshake without sending any data.

But we don't have a custom TCP application running on healthCheckNodePort which returns OK, do we?

mdbooth avatar May 10 '23 16:05 mdbooth

But we don't have a custom TCP application running on healthCheckNodePort which returns OK, do we?

And we should not need one. Normally TCP health checking means opening a TCP connection (SYN packet) and considering the check successful if the connection is established (SYN/ACK): https://www.haproxy.com/documentation/aloha/latest/load-balancing/health-checks/tcp/ That is a proper L4 check; sending data or waiting for some specific "OK" would create an L7 check (and we should then use HTTP, of course). So I suspect the OpenStack documentation is misleading here. Will do some testing to be sure ...
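As a concrete illustration of such an L4 check, a plain connect-and-close probe in Go could look like this (illustrative only, unrelated to Octavia's implementation; address and timeout are hypothetical):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// tcpProbe illustrates a plain L4 check: the TCP handshake itself
// (SYN -> SYN/ACK) is the health signal; no application data or "OK"
// response is exchanged.
func tcpProbe(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false // no listener, host down, or network unreachable
	}
	_ = conn.Close() // close right after the handshake succeeds
	return true
}

func main() {
	fmt.Println(tcpProbe("192.0.2.10:31080", 2*time.Second))
}
```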

garloff avatar May 10 '23 17:05 garloff

I pushed an explicit check to ensure we can only use a TCP health-mon if we have a TCP LB. (The UDP case was safe before, but let's make it explicit and robust.) Testing: docker pull registry.scs.community/occm-kg/garloff/openstack-cloud-controller-manager@sha256:6c3d70c3a10e0411a6696c13634991ea033f1831f298e724c6d9343f7bc1e3fa
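A rough sketch of the kind of guard described here (hypothetical shape, not the pushed commit itself):

```go
package main

import "fmt"

// fallbackMonitorType sketches the extra guard described above: the TCP
// fallback only applies to TCP listeners, UDP keeps its UDP-CONNECT
// monitor, and any other protocol gets no monitor at all. This mirrors
// the described intent, not the exact committed code.
func fallbackMonitorType(listenerProtocol string) (string, bool) {
	switch listenerProtocol {
	case "TCP":
		return "TCP", true
	case "UDP":
		return "UDP-CONNECT", true
	default:
		return "", false // refuse to create a monitor type we can't support
	}
}

func main() {
	fmt.Println(fallbackMonitorType("TCP")) // TCP true
}
```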

garloff avatar May 10 '23 21:05 garloff

We have 2 major mismatches here:

We have Octavia documentation suggesting that a TCP check expects an OK response, but nothing will deliver that. The suggestion is that the documentation is wrong. If that's the case I'd prefer to see strong evidence of that, a long-form comment explaining this to the future similarly-confused maintainer, and preferably a linked docs patch against the Octavia docs to fix it.

If the Octavia documentation is not correct and the TCP check is indeed just a SYN/ACK check, then we still have documentation (admittedly for GKE) suggesting that we will always get a response on this port whether a pod is present or not, which would mean this isn't a valid health check.

We need code comments strongly backing up whatever choice we make here. As it stands I can't convince myself that this behaviour is correct. That's not to say it's incorrect, but it's sufficiently in doubt that we need to carefully document the assumptions we've made so they can be confidently challenged by future-us.

mdbooth avatar May 11 '23 10:05 mdbooth

We have 2 major mismatches here:

We have Octavia documentation suggesting that a TCP check expects an OK response, but nothing will deliver that. The suggestion is that the documentation is wrong. If that's the case I'd prefer to see strong evidence of that, a long-form comment explaining this to the future similarly-confused maintainer, and preferably a linked docs patch against the Octavia docs to fix it.

Sure, makes sense. Anyway, this wording sounds like it's just a connection being created:

TCP health monitors open a TCP connection to the back-end server’s protocol port. Your custom TCP application should be written to respond OK to the load balancer connecting, opening a TCP connection, and closing it again after the TCP handshake without sending any data.

If the Octavia documentation is not correct and the TCP check is indeed just a SYN/ACK check, then we still have documentation (admittedly for GKE) suggesting that we will always get a response on this port whether a pod is present or not, which would mean this isn't a valid health check.

In this case we won't be calling that endpoint at all, as we will not use healthCheckNodePort when lbaas.canUseHTTPMonitor() returns false [1]. The health monitor will call the Service NodePort directly, and with ETP=Local that's a good source of truth about which nodes are hosting the Service pods.

We need code comments strongly backing up whatever choice we make here. As it stands I can't convince myself that this behaviour is correct. That's not to say it's incorrect, but it's sufficiently in doubt that we need to carefully document the assumptions we've made so they can be confidently challenged by future-us.

+1 for more comments! I haven't added enough myself. We should especially add a comment to [1].

[1] https://github.com/kubernetes/cloud-provider-openstack/blob/bd767ba5d4493d2aa7965d57286f1e6460e9dba9/pkg/openstack/loadbalancer.go#L1288-L1290

dulek avatar May 11 '23 10:05 dulek

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 16 '23 23:08 k8s-ci-robot

@dulek This is approved but it sounds like we have doubts about it now?

mdbooth avatar Aug 30 '23 10:08 mdbooth

@dulek This is approved but it sounds like we have doubts about it now?

Yes, we feel like the use case described in the PR and bug should just work in the current state of the code. Waiting for @garloff's input here.

dulek avatar Aug 30 '23 12:08 dulek

@dulek @jichenjc We should probably remove the approval. If the bot complains that I'm not allowed to do this, it's probably the right command :)

/approve cancel

mdbooth avatar Aug 30 '23 12:08 mdbooth

/remove-approve

mdbooth avatar Aug 30 '23 12:08 mdbooth

/shrug

mdbooth avatar Aug 30 '23 12:08 mdbooth

@garloff: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
openstack-cloud-controller-manager-ovn-e2e-test 7dfb7b7723f83bbe431e1ca26950975748d0509c link true /test openstack-cloud-controller-manager-ovn-e2e-test
openstack-cloud-csi-cinder-e2e-test 7dfb7b7723f83bbe431e1ca26950975748d0509c link true /test openstack-cloud-csi-cinder-e2e-test
openstack-cloud-csi-manila-sanity-test 7dfb7b7723f83bbe431e1ca26950975748d0509c link true /test openstack-cloud-csi-manila-sanity-test
openstack-cloud-csi-manila-e2e-test 7dfb7b7723f83bbe431e1ca26950975748d0509c link true /test openstack-cloud-csi-manila-e2e-test
openstack-cloud-csi-cinder-sanity-test 7dfb7b7723f83bbe431e1ca26950975748d0509c link true /test openstack-cloud-csi-cinder-sanity-test
openstack-cloud-keystone-authentication-authorization-test 7dfb7b7723f83bbe431e1ca26950975748d0509c link true /test openstack-cloud-keystone-authentication-authorization-test
openstack-cloud-controller-manager-e2e-test 7dfb7b7723f83bbe431e1ca26950975748d0509c link true /test openstack-cloud-controller-manager-e2e-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Nov 01 '23 03:11 k8s-ci-robot

/approve cancel

jichenjc avatar Nov 29 '23 01:11 jichenjc

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign fengyunpan2 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment. Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Nov 29 '23 01:11 k8s-ci-robot

@mdbooth Done, I removed the approval just now.

jichenjc avatar Nov 29 '23 02:11 jichenjc

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 12 '24 16:03 k8s-triage-robot

This one is no longer needed. Thanks for all the explanations and doc improvements!

garloff avatar Mar 12 '24 18:03 garloff