
K8s Ingress stopped working in 1.6.x

Open hmeerlo opened this issue 1 year ago • 15 comments

Describe the bug

I upgraded OrbStack to 1.6.0 (and to 1.6.1 after that). After the upgrade my Ingresses stopped working. I deleted the nginx ingress controller and re-applied the YAML files, but no luck. The state stays like this forever:

Normal  Sync    32s    nginx-ingress-controller  Scheduled for sync

The docs still refer to installing version 1.8.1 of the nginx ingress controller. Is that still valid?
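
For reference, the install I mean is the standard ingress-nginx deploy manifest (this exact URL is my assumption; substitute whatever version the OrbStack docs actually pin):

➜ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml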

Also, there is no listener on port 80 anymore; this used to work:

➜ curl -v http://accounts.k8s.orb.local
*   Trying 198.19.249.2:80...
* connect to 198.19.249.2 port 80 failed: Connection refused
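
For anyone triaging, a couple of standard checks (assuming the default ingress-nginx namespace and Service name, plus k3s's svclb pod label convention):

➜ kubectl -n ingress-nginx get svc ingress-nginx-controller
➜ kubectl -n kube-system get pods -l svccontroller.k3s.cattle.io/svcname=ingress-nginx-controller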

To Reproduce

For me it was upgrading from 1.5.1 to 1.6.0.

Expected behavior

Ingress keeps working

Diagnostic report (REQUIRED)

OrbStack info: Version: 1.6.1 Commit: bfaddc6839de8b00b7aff767dbd673ac6ad4259e (v1.6.1)

System info: macOS: 14.1 (23B74) CPU: arm64, 10 cores CPU model: Apple M1 Pro Model: MacBookPro18,3 Memory: 32 GiB

Full report: https://orbstack.dev/_admin/diag/orbstack-diagreport_2024-05-30T14-57-02.437635Z.zip

Screenshots and additional context (optional)

No response

hmeerlo avatar May 30 '24 14:05 hmeerlo

Sorry about the confusion of closing and reopening this issue; I thought I had spotted the problem, but that wasn't the case. The Nginx Ingress controller tries to start, but its svclb load-balancer pod fails:

➜ kubectl logs -n kube-system svclb-ingress-nginx-controller-9998ea23-rk5ll
Defaulted container "lb-tcp-80" out of: lb-tcp-80, lb-tcp-443
+ trap exit TERM INT
+ BIN_DIR=/sbin
+ check_iptables_mode
+ set +e
+ lsmod
[INFO]  legacy mode detected
+ grep -qF nf_tables
+ '[' 1 '=' 0 ]
+ mode=legacy
+ set -e
+ info 'legacy mode detected'
+ set_legacy
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables-save
+ ln -sf /sbin/xtables-legacy-multi /sbin/iptables-restore
+ ln -sf /sbin/xtables-legacy-multi /sbin/ip6tables
+ start_proxy
+ echo 0.0.0.0/0
+ grep -Eq :
+ iptables -t filter -I FORWARD -s 0.0.0.0/0 -p TCP --dport 31997 -j ACCEPT
+ echo 198.19.249.2
+ grep -Eq :
+ cat /proc/sys/net/ipv4/ip_forward
+ '[' 1 '==' 1 ]
+ iptables -t filter -A FORWARD -d 198.19.249.2/32 -p TCP --dport 31997 -j DROP
+ iptables -t nat -I PREROUTING -p TCP --dport 80 -j DNAT --to 198.19.249.2:31997
+ iptables -t nat -I POSTROUTING -d 198.19.249.2/32 -p TCP -j MASQUERADE
+ echo fd07:b51a:cc66:0:1878:30ff:fe64:16a9
+ grep -Eq :
+ cat /proc/sys/net/ipv6/conf/all/forwarding
+ '[' 0 '==' 1 ]
+ exit 1

i.e. it exits because /proc/sys/net/ipv6/conf/all/forwarding is not set to 1. Do I have any influence on this?

hmeerlo avatar May 30 '24 17:05 hmeerlo

More info: this is caused by the DEST_IPS environment variable being populated from (v1:status.hostIPs):

Name:             svclb-ingress-nginx-controller-9998ea23-ctvrh
Namespace:        kube-system
Priority:         0
Service Account:  svclb
Node:             orbstack/198.19.249.2
Start Time:       Thu, 30 May 2024 19:28:30 +0200
Labels:           app=svclb-ingress-nginx-controller-9998ea23
                  controller-revision-hash=b4bd7cf5d
                  pod-template-generation=1
                  svccontroller.k3s.cattle.io/svcname=ingress-nginx-controller
                  svccontroller.k3s.cattle.io/svcnamespace=ingress-nginx
Annotations:      <none>
Status:           Running
IP:               192.168.194.26
IPs:
  IP:           192.168.194.26
  IP:           fd07:b51a:cc66:a::3:f302
Controlled By:  DaemonSet/svclb-ingress-nginx-controller-9998ea23
Containers:
  lb-tcp-80:
    Container ID:   docker://f52665e5253f8e005900a7f365a2f566a895105121a4a5a1c9ac5ffa1ce168fb
    Image:          rancher/klipper-lb:v0.4.7
    Image ID:       docker-pullable://rancher/klipper-lb@sha256:558dcf96bf0800d9977ef46dca18411752618cd9dd06daeb99460c0a301d0a60
    Port:           80/TCP
    Host Port:      80/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 30 May 2024 19:31:35 +0200
      Finished:     Thu, 30 May 2024 19:31:35 +0200
    Ready:          False
    Restart Count:  5
    Environment:
      SRC_PORT:    80
      SRC_RANGES:  0.0.0.0/0
      DEST_PROTO:  TCP
      DEST_PORT:   31997
      DEST_IPS:     (v1:status.hostIPs)
<etc>

Because this value contains a colon, the script assumes it has to route to an IPv6 address and therefore requires IPv6 forwarding to be enabled.
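
A minimal sketch of the failing logic, paraphrased from the trace above (this is my reading, not the actual klipper-lb source):

# DEST_IPS is filled from the downward API field v1:status.hostIPs and can
# contain both node addresses (198.19.249.2 and fd07:b51a:cc66::2 here)
for ip in $DEST_IPS; do
  if echo "$ip" | grep -Eq :; then
    # a colon means IPv6, so IPv6 forwarding must be enabled, otherwise bail
    [ "$(cat /proc/sys/net/ipv6/conf/all/forwarding)" = 1 ] || exit 1
  fi
done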

hmeerlo avatar May 30 '24 17:05 hmeerlo

This is the problem (and it contains a workaround): https://github.com/k3s-io/k3s/issues/9949

hmeerlo avatar May 30 '24 17:05 hmeerlo

@hmeerlo Thanks for the link to the k3s issue. We will update k3s after the fix is merged. Are you able to work around it in the meantime?

slinorb avatar May 30 '24 18:05 slinorb

Yes, the workaround is in the k3s issue. It means adding this sysctl to the DaemonSet:

          - name: net.ipv6.conf.all.forwarding
            value: '1'
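
For completeness, a sketch of where this lands in the svclb DaemonSet pod template (my assumption: under the pod-level securityContext, which is where Kubernetes accepts sysctls; the rest of the spec is elided). Note that net.ipv6.conf.all.forwarding is not in the kubelet's default safe-sysctl set, so it may also need to be allowlisted:

spec:
  template:
    spec:
      securityContext:
        sysctls:
          # the actual workaround from the k3s issue
          - name: net.ipv6.conf.all.forwarding
            value: '1'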

hmeerlo avatar May 31 '24 08:05 hmeerlo

Thanks, fixed for the next version.

kdrag0n avatar May 31 '24 09:05 kdrag0n

Released in v1.6.2.

kdrag0n avatar Jun 13 '24 04:06 kdrag0n

@kdrag0n I still experience this problem in 1.6.2:

lb-tcp-80 + iptables -t nat -I PREROUTING -p TCP --dport 80 -j DNAT --to 198.19.249.2:31997
lb-tcp-80 + iptables -t nat -I POSTROUTING -d 198.19.249.2/32 -p TCP -j MASQUERADE
lb-tcp-80 + echo fd07:b51a:cc66::2
lb-tcp-80 + grep -Eq :
lb-tcp-80 + cat /proc/sys/net/ipv6/conf/all/forwarding
lb-tcp-80 + '[' 0 '==' 1 ]
lb-tcp-80 + exit 1

So it still will not start the ingress controller service.

hmeerlo avatar Jun 17 '24 14:06 hmeerlo

FYI, it is not fixed in 1.6.3

hmeerlo avatar Jul 05 '24 07:07 hmeerlo

I double-checked: even when I disable IPv6 in OrbStack and create a new k8s cluster, the node still gets an IPv6 address:

➜ kubectl describe nodes orbstack
Name:               orbstack
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=arm64
                    kubernetes.io/hostname=orbstack
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=true
                    node-role.kubernetes.io/master=true
                    node.kubernetes.io/instance-type=k3s
Annotations:        alpha.kubernetes.io/provided-node-ip: 198.19.249.2,fd07:b51a:cc66::2
                    flannel.alpha.coreos.com/backend-data: null
                    flannel.alpha.coreos.com/backend-type: host-gw
                    flannel.alpha.coreos.com/backend-v6-data: null
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 198.19.249.2
                    flannel.alpha.coreos.com/public-ipv6: fd07:b51a:cc66::2
                    k3s.io/hostname: orbstack
                    k3s.io/internal-ip: 198.19.249.2,fd07:b51a:cc66::2

hmeerlo avatar Jul 05 '24 10:07 hmeerlo

I did some more digging (apparently it's just k3s under the hood). It is caused by the externalTrafficPolicy of the Service: when it is set to Local instead of Cluster, it fails, because then k3s sets DEST_IPS to status.hostIPs instead of the actual IPv4 addresses.
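
A one-liner for that workaround (assuming the usual ingress-nginx namespace and Service name):

➜ kubectl -n ingress-nginx patch svc ingress-nginx-controller --type merge -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'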

hmeerlo avatar Jul 05 '24 11:07 hmeerlo

Any update?

bostin avatar Jul 25 '24 00:07 bostin

Is this fixed?

bostin avatar Sep 04 '24 14:09 bostin

@bostin I'm still running into this on the latest version of OrbStack. The suggested workarounds offered by @hmeerlo solved the problem for me (setting externalTrafficPolicy to Cluster in the Nginx DaemonSet, and also setting net.ipv6.conf.all.forwarding to '1').

FindAPattern avatar Sep 26 '24 13:09 FindAPattern

@FindAPattern Thank you for your reply. What are the detailed steps for configuring OrbStack?

bostin avatar Sep 27 '24 00:09 bostin

I'm also having this issue in the latest version

ludvigsen avatar Oct 10 '24 08:10 ludvigsen

@bostin I encountered the same issue, and setting net.ipv6.conf.all.forwarding to '1' didn't work for me: every time I saved the changes, the DaemonSet wasn't actually updated. I eventually found that simply setting externalTrafficPolicy to Cluster resolved the problem. Specifically, find ingress-nginx-controller in the Service list, edit its YAML, and change externalTrafficPolicy: Local to externalTrafficPolicy: Cluster. That fixed the issue for me, with no need to configure OrbStack itself.
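
In other words, the edit is (a sketch of just the relevant field; the rest of the Service spec stays untouched):

spec:
  externalTrafficPolicy: Cluster   # was: Local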

realmorrisliu avatar Oct 11 '24 02:10 realmorrisliu

FWIW, this is still happening on 1.9.1. Changing externalTrafficPolicy to Cluster and setting net.ipv6.conf.all.forwarding=1 on the DaemonSet works, but the changes get reset.

I'm using Envoy, if that's of any use.

jerguslejko avatar Dec 13 '24 20:12 jerguslejko

@realmorrisliu Thanks, I will try this later.

bostin avatar Dec 18 '24 01:12 bostin

Fixed in v1.9.4.

slinorb avatar Jan 15 '25 04:01 slinorb