GlobalNetworkPolicy Using `notNets` and `0.0.0.0/0` Causes `calico-node` Pods to Crash

Open mike-weiner opened this issue 1 year ago • 2 comments

Applying the following GlobalNetworkPolicy causes the calico-node pods to crash: they fail to update the iptables rules and eventually hit a panic. Ultimately, this results in the calico-node pods becoming not ready.

The key here is that this bug only appears to happen on hosts using RHEL or Red Hat CoreOS as the operating system. I am unable to replicate this on Ubuntu-based hosts.

GlobalNetworkPolicy:

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: demo
spec:
  applyOnForward: true
  ingress:
  - action: Deny
    destination:
      ports:
      - <omitted>
    protocol: TCP
    source:
      notNets:
      - 0.0.0.0/0
  order: 1500
  preDNAT: true
  selector: <omitted>
  types:
  - Ingress

Expected Behavior

I would expect to be able to create a GNP that uses 0.0.0.0/0, have the iptables rules update correctly, and not have the calico-node pods crash.

Current Behavior

If using 0.0.0.0/0 in a GlobalNetworkPolicy is unsupported or is going to cause calico-node to crash, I would not expect to be able to create the GNP. Currently the GNP is accepted and the calico-node pods then crash. See the steps to reproduce this below.

Possible Solution

If using 0.0.0.0/0 is not supported in a GNP, can we prevent GNPs that use it from being created? As a temporary workaround, changing the above GNP to use 1.1.1.1/32 does not cause the calico-node pods to crash.

GlobalNetworkPolicy:

apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: demo
spec:
  applyOnForward: true
  ingress:
  - action: Deny
    destination:
      ports:
      - <omitted>
    protocol: TCP
    source:
      notNets:
      - 1.1.1.1/32
  order: 1500
  preDNAT: true
  selector: <omitted>
  types:
  - Ingress

Steps to Reproduce (for bugs)

  1. Create an OpenShift cluster that uses RHEL or CoreOS as the OS for the cluster's worker nodes.
  2. Apply the GNP listed above.
  3. Run oc rollout restart ds -n calico-system calico-node to restart the calico-node DaemonSet.
  4. The calico-node pods will start but then panic. Dumping out the logs should yield something similar to the following:
weiner ~ % kubectl logs -n calico-system calico-node-g2b9h --tail=250
Defaulted container "calico-node" out of: calico-node, install-cni (init)
.
<OMITTED FOR BREVITY>
.
2024-09-05 01:40:25.585 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.588 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.618 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n:cali-fh-ens3 - -\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:25.618 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.623 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.679 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n:cali-fh-ens3 - -\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:25.679 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.688 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.726 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-fh-ens3 - -\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:25.727 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.743 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.772 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n:cali-fh-ens3 - -\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:25.772 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.805 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.837 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n:cali-fh-ens3 - -\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:25.837 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.902 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:25.933 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n:cali-fh-ens3 - -\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:25.934 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.063 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.096 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n:cali-fh-ens3 - -\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:26.096 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.353 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.397 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-fh-ens3 - -\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:26.398 [WARNING][3796] felix/table.go 1032: Failed to program iptables, will retry error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.911 [WARNING][3796] felix/table.go 1035: Retrying... error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.939 [WARNING][3796] felix/table.go 1409: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): \nline 16: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_UQtJSt51HPofoxgw2Ta\n" input="*mangle\n:cali-from-host-endpoint - -\n:cali-failsafe-out - -\n:cali-failsafe-in - -\n:cali-pi-_UQtJSt51HPofoxgw2Ta - -\n:cali-fh-ens3 - -\n-A cali-from-host-endpoint -m comment --comment \"cali:p1MmP3iYm6Wo6_ox\" --in-interface ens3 --goto cali-fh-ens3\n-A cali-pi-_UQtJSt51HPofoxgw2Ta -m comment --comment \"cali:i-9cyrLWTwUoqGQr\" -m comment --comment \"Policy default.mweiner-db2-allowlist-test ingress\" -p tcp -m multiport --destination-ports 30000:32760 ! --source 0.0.0.0/0 --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:1rTPISOVtZDBddv3\" -m conntrack --ctstate RELATED,ESTABLISHED --jump ACCEPT\n-A cali-fh-ens3 -m comment --comment \"cali:lhkU3lrj1voPrnuq\" -m conntrack --ctstate INVALID --jump DROP\n-A cali-fh-ens3 -m comment --comment \"cali:kRTULA6jEcBy3Esr\" --jump cali-failsafe-in\n-A cali-fh-ens3 -m comment --comment \"cali:HS0aeSKiUBCYj1PQ\" --jump MARK --set-mark 0/0x10000\n-A cali-fh-ens3 -m comment --comment \"cali:n447PYjYxu8ONymc\" -m comment --comment \"Start of policies\" --jump MARK --set-mark 0/0x20000\n-A cali-fh-ens3 -m comment --comment \"cali:OQTSdUOzf6ci66bv\" -m mark --mark 0/0x20000 --jump cali-pi-_UQtJSt51HPofoxgw2Ta\n-A cali-fh-ens3 -m comment --comment \"cali:u5iZUM1M-AJpghYV\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 --jump RETURN\nCOMMIT\n" ipVersion=0x4 output="" table="mangle"
2024-09-05 01:40:26.939 [ERROR][3796] felix/table.go 1039: Failed to program iptables, loading diags before panic. error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.953 [ERROR][3796] felix/table.go 1045: Current state of iptables ipVersion=0x4 iptablesState="# Generated by iptables-nft-save v1.8.8 (nf_tables) on Thu Sep  5 01:40:26 2024\n*mangle\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:FORWARD ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\n:KUBE-IPTABLES-HINT - [0:0]\n:KUBE-KUBELET-CANARY - [0:0]\n:KUBE-PROXY-CANARY - [0:0]\n:cali-POSTROUTING - [0:0]\n:cali-PREROUTING - [0:0]\n:cali-failsafe-out - [0:0]\n:cali-from-host-endpoint - [0:0]\n:cali-po-_VKFJOaMnQY1bqrOtkS0 - [0:0]\n:cali-th-ens3 - [0:0]\n:cali-to-host-endpoint - [0:0]\n-A PREROUTING -m comment --comment \"cali:6gwbT8clXdHdC1b1\" -j cali-PREROUTING\n-A POSTROUTING -m comment --comment \"cali:O3lYWMrLQYEMJtB5\" -j cali-POSTROUTING\n-A cali-POSTROUTING -m comment --comment \"cali:NX-7roTexQ3fGRfU\" -m mark --mark 0x10000/0x10000 -j RETURN\n-A cali-POSTROUTING -m comment --comment \"cali:nnqPh8lh2VOogSzX\" -j MARK --set-xmark 0x0/0xf0000\n-A cali-POSTROUTING -m comment --comment \"cali:nquN8Jw8Tz72pcBW\" -m conntrack --ctstate DNAT -j cali-to-host-endpoint\n-A cali-POSTROUTING -m comment --comment \"cali:jWrgvDQ0xEZHmta3\" -m comment --comment \"Host endpoint policy accepted packet.\" -m mark --mark 0x10000/0x10000 -j RETURN\n-A cali-PREROUTING -m comment --comment \"cali:6BJqBjBC7crtA-7-\" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT\n-A cali-PREROUTING -m comment --comment \"cali:KX7AGNd6rMcDUai6\" -m mark --mark 0x10000/0x10000 -j ACCEPT\n-A cali-PREROUTING -m comment --comment \"cali:wNH7KsA3ILKJBsY9\" -j cali-from-host-endpoint\n-A cali-PREROUTING -m comment --comment \"cali:Cg96MgVuoPm7UMRo\" -m comment --comment \"Host endpoint policy accepted packet.\" -m mark --mark 0x10000/0x10000 -j ACCEPT\n-A cali-po-_VKFJOaMnQY1bqrOtkS0 -m comment --comment \"cali:MggzujhBJtoKN_-K\" -m comment --comment \"Policy default.allow-all-private-default egress\" -j MARK --set-xmark 0x10000/0x10000\n-A cali-po-_VKFJOaMnQY1bqrOtkS0 
-m comment --comment \"cali:hY7beqELirCfBLs8\" -m mark --mark 0x10000/0x10000 -j RETURN\n-A cali-th-ens3 -m comment --comment \"cali:V3Ohfo9qw_gquX5N\" -m conntrack --ctstate RELATED,ESTABLISHED -j MARK --set-xmark 0x10000/0x10000\n-A cali-th-ens3 -m comment --comment \"cali:Bj_9MwISgJq3LlPv\" -m conntrack --ctstate RELATED,ESTABLISHED -j RETURN\n-A cali-th-ens3 -m comment --comment \"cali:UDIShtSW1uDudgkc\" -m conntrack --ctstate INVALID -j DROP\n-A cali-th-ens3 -m comment --comment \"cali:iRjWYuKXg7ne5Qj7\" -j cali-failsafe-out\n-A cali-th-ens3 -m comment --comment \"cali:1PHCAuAM_oNbSTF_\" -j MARK --set-xmark 0x0/0x10000\n-A cali-th-ens3 -m comment --comment \"cali:lM0HTYTcA7IbhjA4\" -m comment --comment \"Start of policies\" -j MARK --set-xmark 0x0/0x20000\n-A cali-th-ens3 -m comment --comment \"cali:wv1xQrAz7wMxFStZ\" -m mark --mark 0x0/0x20000 -j cali-po-_VKFJOaMnQY1bqrOtkS0\n-A cali-th-ens3 -m comment --comment \"cali:wBYa83dIffxEfmm2\" -m comment --comment \"Return if policy accepted\" -m mark --mark 0x10000/0x10000 -j RETURN\n-A cali-th-ens3 -m comment --comment \"cali:YN5meeJ0xo6rkQTH\" -m comment --comment \"Drop if no policies passed packet\" -m mark --mark 0x0/0x20000 -j DROP\n-A cali-th-ens3 -m comment --comment \"cali:JNZU3B4pEesV-4Bs\" -m comment --comment \"Drop if no profiles matched\" -j DROP\n-A cali-to-host-endpoint -o ens3 -m comment --comment \"cali:mv1QbpxXnvBbq5st\" -g cali-th-ens3\nCOMMIT\n# Completed on Thu Sep  5 01:40:26 2024\n" table="mangle"
2024-09-05 01:40:26.953 [PANIC][3796] felix/table.go 1047: Failed to program iptables, giving up after retries error=writting out buffer: exit status 4 ipVersion=0x4 table="mangle"
2024-09-05 01:40:26.953 [INFO][3796] felix/table.go 980: Updating iptables took >1s applyTime=1.450371887s reasonForApply=""
panic: (*logrus.Entry) 0xc000752bd0

goroutine 191 [running]:
github.com/sirupsen/logrus.(*Entry).log(0xc000752b60, 0x0, {0xc000afc800, 0x33})
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:260 +0x491
github.com/sirupsen/logrus.(*Entry).Log(0xc000752b60, 0x0, {0xc000b0f5d8?, 0x5?, 0x1388dcf91?})
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:304 +0x48
github.com/sirupsen/logrus.(*Entry).Panic(...)
	/go/pkg/mod/github.com/sirupsen/[email protected]/entry.go:342
github.com/projectcalico/calico/felix/iptables.(*Table).Apply(0xc000000240)
	/go/src/github.com/projectcalico/calico/felix/iptables/table.go:1047 +0xc9d
github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply.func4(0xc0002bc800?)
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2240 +0x4c
created by github.com/projectcalico/calico/felix/dataplane/linux.(*InternalDataplane).apply in goroutine 46
	/go/src/github.com/projectcalico/calico/felix/dataplane/linux/int_dataplane.go:2239 +0x1306

The mesh itself still appeared to be healthy:

weiner ~ % kubectl exec -it -n calico-system calico-node-g2b9h -c calico-node -- /bin/birdcl -s /var/run/calico/bird.ctl show protocol
BIRD v0.3.3+birdv1.6.8 ready.
name     proto    table    state  since       info
static1  Static   master   up     01:38:55
kernel1  Kernel   master   up     01:38:55
device1  Device   master   up     01:38:55
direct1  Direct   master   up     01:38:55
Mesh_10_240_0_39 BGP      master   up     01:38:57    Established
weiner ~ % kubectl exec -it -n calico-system calico-node-q8xj4 -c calico-node -- /bin/birdcl -s /var/run/calico/bird.ctl show protocol
BIRD v0.3.3+birdv1.6.8 ready.
name     proto    table    state  since       info
static1  Static   master   up     01:38:43
kernel1  Kernel   master   up     01:38:43
device1  Device   master   up     01:38:43
direct1  Direct   master   up     01:38:43
Mesh_10_240_0_40 BGP      master   up     01:38:57    Established
  5. If you edit the above GNP with the following change and apply it, the calico-node pods will stop crashing because they can update the iptables rules:
    source:
      notNets:
      - 1.1.1.1/32
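
When triaging the felix warnings from step 4, note that iptables-nft-restore reports the failure at the COMMIT line and only names the chain, so the offending rule has to be found in the input= payload. A minimal sketch in Python, assuming the log line format shown above; the helper name find_rejected_rules is hypothetical and not part of Calico:

```python
import re


def find_rejected_rules(log_line: str) -> list[str]:
    """Return the '-A' rules from the logged restore input that belong to
    chains named in the 'rule in chain ...' error messages."""
    # felix escapes newlines and quotes inside quoted fields; undo that so
    # each restore line sits on its own line. Real newlines are left alone.
    text = log_line.replace("\\n", "\n").replace('\\"', '"')
    # Chains that iptables-nft-restore complained about.
    chains = set(re.findall(r"rule in chain (\S+)", text))
    rejected = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "-A" and parts[1] in chains:
            rejected.append(line)
    return rejected
```

Feeding it one of the "Failed to execute ip(6)tables-restore command" lines should return the rules appended to the cali-pi-... chain, which is where the ! --source 0.0.0.0/0 match shows up.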

Context

The use case here is a service that has an allow list feature.

By default, the service would like a GNP that allows all traffic in. Once the customer supplies a list of CIDRs that should be the only ones allowed to access the resource, the 0.0.0.0/0 CIDR is replaced with the CIDRs specified by the user.
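
One way the allowlist tooling could sidestep the problem is to never emit notNets with a zero-length prefix. The following is only a sketch under assumptions: the function name deny_rule_for_allowlist is hypothetical, the rule layout mirrors the GNP above, and what the caller does when it gets None back (for example, not creating the policy at all) is left open, since whether that yields the intended "allow all" behavior depends on the other policies in the cluster.

```python
import ipaddress


def deny_rule_for_allowlist(allowed_cidrs, ports):
    """Build the ingress Deny rule for an allowlist.

    Returns None when the allowlist is empty or already covers every
    address, because negating such a list would match no traffic.
    """
    nets = [ipaddress.ip_network(c, strict=False) for c in allowed_cidrs]
    # Prefix length 0 (0.0.0.0/0 or ::/0) means "everyone is allowed".
    if not nets or any(n.prefixlen == 0 for n in nets):
        return None
    return {
        "action": "Deny",
        "protocol": "TCP",
        "destination": {"ports": list(ports)},
        "source": {"notNets": [str(n) for n in nets]},
    }
```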

Your Environment

  • Calico version: v3.27.4
  • Orchestrator version (e.g. kubernetes, mesos, rkt): OpenShift 4.15 (Kubernetes 1.28)
  • Operating System and version: Red Hat CoreOS or RHEL 8.10
  • Link to your project (optional):
weiner ~ % cali version
Client Version:    v3.26.1
Git commit:        b1d192c95
Cluster Version:   v3.27.4
Cluster Type:      k8s,operator,openshift,bgp,kdd,typha
weiner ~ % k get nodes -A -o wide
NAME                                                     STATUS   ROLES           AGE   VERSION            INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
<REDACTED>                                               Ready    master,worker   11m   v1.28.12+396c881   10.240.0.39   10.240.0.39   Red Hat Enterprise Linux CoreOS 415.92.202408100433-0 (Plow)   5.14.0-284.79.1.el9_2.x86_64   cri-o://1.28.9-5.rhaos4.15.git674ed4c.el9
<REDACTED>                                               Ready    master,worker   10m   v1.28.12+396c881   10.240.0.40   10.240.0.40   Red Hat Enterprise Linux CoreOS 415.92.202408100433-0 (Plow)   5.14.0-284.79.1.el9_2.x86_64   cri-o://1.28.9-5.rhaos4.15.git674ed4c.el9
weiner ~ % kubectl exec -it -n calico-system calico-node-q8xj4 -c calico-node -- iptables -V
iptables v1.8.8 (legacy)

mike-weiner avatar Sep 05 '24 19:09 mike-weiner

@mike-weiner Were you trying to DENY all IPv4 and to allow only IPv6?

sridhartigera avatar Sep 12 '24 15:09 sridhartigera

@mike-weiner Were you trying to DENY all IPv4 and to allow only IPv6?

@sridhartigera - No, IPv4 only.

The use case was that this policy could be applied by default to allow all traffic.

Customers can then create an allowlist for a service so only certain IPs can be allowed inbound. We would then replace 0.0.0.0/0 with the list of IPs provided by the customer that should be allowed inbound.

mike-weiner avatar Sep 13 '24 00:09 mike-weiner

This also seems to affect Ubuntu 24 workers running Calico 3.28.2, although the behavior is slightly different: the calico-node pods don't crash, but they never reach the "Ready" state and are likely not applying any network policies, judging by messages like the following in the calico-node pod logs:

2025-04-17T20:14:15.381478564Z 2025-04-17 20:14:15.374 [WARNING][64734] felix/table.go 1440: Failed to execute ip(6)tables-restore command error=exit status 4 errorOutput="iptables-nft-restore v1.8.8 (nf_tables): 
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_eeRzcfLU0LBkSlw-Stu
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_K-3J0urBdKU8zP_Dr9F
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_V3NRX2XFvIl0OkLIKmg
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_s-o2YDOA_30XCIrfsw8
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_WZPO4m4V4N-MIlebxLX
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_Mft4nVnZMVuKCQu7Why
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_7fpvkzsEwR8oVavP3R3
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_77w3m7Cr4Io-8YNIb08
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_1du5rYGjSDSAe8eCb11
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_K1IxYd44joewPlFT4Uu
line 2591: RULE_APPEND failed (Invalid argument): rule in chain cali-pi-_pKtmI395FNn06kNFg7Y
" input="*mangle
:cali-pi-_fQHe7cLg4hdhj8-5z1c - -
:cali-pi-_Q2wVBKOqBA1UWBVeFuP - -
:cali-pi-_2-EWK4EuRUizygMcQxX - -
:cali-pi-_UdlZMFyyE4GDueXRg_p - -
...
-A cali-pi-_eeRzcfLU0LBkSlw-Stu -m comment --comment \"cali:PISm3FPduioJmjeY\" -m comment --comment \"Policy default.14791fa9-b3eb-4a0f-a610-dd088f30734d ingress\" -p tcp -m multiport --destination-ports 30111,31111 ! --source 0.0.0.0/0 --jump DROP
-A cali-pi-_K-3J0urBdKU8zP_Dr9F -m comment --comment \"cali:LPqwG0PrDvi9aInx\" -m comment --comment \"Policy default.1ceffa99-5776-463e-a862-dd7ab581185f ingress\" -p tcp -m multiport --destination-ports 30211,31211 ! --source 0.0.0.0/0 --jump DROP
-A cali-pi-_V3NRX2XFvIl0OkLIKmg -m comment --comment \"cali:MMmOjdQvob3HVNTm\" -m comment --comment \"Policy default.cc47e70a-fad6-455b-8bf0-fbb2c1f5badd ingress\" -p tcp -m multiport --destination-ports 30311,31311 ! --source 0.0.0.0/0 --jump DROP
-A cali-pi-_s-o2YDOA_30XCIrfsw8 -m comment --comment \"cali:uD6nLS0TAEhmCVbb\" -m comment --comment \"Policy default.d9be0c59-05cc-43bb-a72c-e8d2b57cb244 ingress\" -p tcp -m multiport --destination-ports 30411,31411 ! --source 0.0.0.0/0 --jump DROP
-A cali-pi-_WZPO4m4V4N-MIlebxLX -m comment --comment \"cali:u1KQdQvCEhdnGAz7\" -m comment --comment \"Policy default.f778ba06-7202-47a3-91fb-ca57094f77c7 ingress\" -p tcp -m multiport --destination-ports 30511,31611 ! --source 0.0.0.0/0 --jump DROP
...

bradbehle avatar Apr 17 '25 20:04 bradbehle

I think this is probably evident to many who read this issue, but I'll add here what I came up with as I started looking into this a bit deeper:

When current versions of Calico process this problematic GNP, more specifically when having

    source:
      notNets:
      - 0.0.0.0/0

in the GNP, we ultimately end up with an iptables command that contains ! --source 0.0.0.0/0. This causes the issue because it tells nf_tables to match packets that are "not from any source", which is an obvious logical contradiction (it cannot apply to any packet), and thus the rule gets rejected:

$ iptables -I INPUT 1 -p tcp ! --source 0.0.0.0/0 -j ACCEPT
iptables v1.8.10 (nf_tables):  RULE_INSERT failed (Invalid argument): rule in chain INPUT
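
The set arithmetic behind that contradiction can be illustrated with Python's standard ipaddress module. This is only a sketch of the reasoning, not a statement about how nf_tables decides to reject the rule; the helper name addresses_matched_by_negation is made up for the example.

```python
import ipaddress

ALL_IPV4 = ipaddress.ip_network("0.0.0.0/0")


def addresses_matched_by_negation(cidr: str) -> int:
    """Count the IPv4 addresses that '! --source <cidr>' could ever match."""
    excluded = ipaddress.ip_network(cidr)
    # address_exclude yields the networks left after removing `excluded`;
    # it yields nothing when the two networks are identical.
    remaining = ALL_IPV4.address_exclude(excluded)
    return sum(net.num_addresses for net in remaining)


print(addresses_matched_by_negation("0.0.0.0/0"))   # prints 0
print(addresses_matched_by_negation("1.1.1.1/32"))  # prints 4294967295
```

With 0.0.0.0/0 nothing is left to match, while the 1.1.1.1/32 workaround leaves every address but one, which is consistent with that rule being accepted.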

dzacball avatar May 09 '25 18:05 dzacball

I opened a Slack thread about this. I'm copying part of that post here as well:

I'm trying to understand how we could defend Calico from these types of crashes. I guess we could:

  1. Use calico-apiserver to reject policies that contain such entries
  2. Make felix defensively ignore such rules. Since these rules would never match any packets, this would effectively not change anything in how we handle traffic in our dataplane

Now, as 1. would be quite invasive and potentially breaking, I'm not sure we'd want to have something like that implemented, and of course it also wouldn't cover policies placed through the Kube API. Moreover, I'm not even sure we'd need 1., because I think that if we properly ignore these rules when rendering our iptables commands, the presence of such policies can do no harm (and cause no confusion for anyone, since the iptables rules they'd be translated to would be a no-op anyway). Here is a commit I put together just to start a discussion with a possible implementation for both 1. and 2. Please let me know what you think!
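
To make the two options concrete, here is a rough Python sketch that works on plain policy dicts. It is not the commit mentioned above and not Calico's actual validation or rendering code; the names is_match_nothing, validate_policy and renderable_rules are hypothetical.

```python
import ipaddress


def is_match_nothing(not_nets):
    """True when a notNets list contains a CIDR that covers every address."""
    return any(
        ipaddress.ip_network(cidr, strict=False).prefixlen == 0
        for cidr in not_nets
    )


def validate_policy(policy):
    """Option 1: return validation errors so the policy can be rejected."""
    errors = []
    spec = policy.get("spec", {})
    for direction in ("ingress", "egress"):
        for i, rule in enumerate(spec.get(direction) or []):
            for side in ("source", "destination"):
                not_nets = (rule.get(side) or {}).get("notNets") or []
                if is_match_nothing(not_nets):
                    errors.append(
                        f"spec.{direction}[{i}].{side}.notNets matches no traffic"
                    )
    return errors


def renderable_rules(rules):
    """Option 2: skip rules that can never match before rendering them."""
    kept = []
    for rule in rules:
        sides = (rule.get("source") or {}, rule.get("destination") or {})
        if any(is_match_nothing(side.get("notNets") or []) for side in sides):
            continue  # the rule would be a no-op, so dropping it changes nothing
        kept.append(rule)
    return kept
```

Running both checks would follow the usual belt-and-braces pattern: reject the input where possible, and still tolerate it if it slips through.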

dzacball avatar Jun 16 '25 14:06 dzacball

I suspect we would ultimately want both of these - our usual strategy is to both validate inputs (apiserver + calicoctl) and also to ensure that bad input won't crash Felix if it somehow gets by our checks.

caseydavenport avatar Jun 24 '25 17:06 caseydavenport

Looks like @coutinhop has already got that PR covered 👍

caseydavenport avatar Jun 24 '25 17:06 caseydavenport

Looks like @coutinhop has already got that PR covered 👍

Yes, @dzacball has a PR in-flight that will fix this (Thanks @dzacball btw!)

coutinhop avatar Jun 25 '25 15:06 coutinhop