
can't change directory to '/lib/modules' when installing on Kubernetes v1.24.1

Open choclo opened this issue 3 years ago • 5 comments

What is the issue?

I'm trying to install Linkerd on a CentOS Stream 8 cluster running Kubernetes v1.24.1. The pods cannot initialize, and describing them shows the following:

      Reason:    CrashLoopBackOff
    Last State:  Terminated
      Reason:    Error
      Message:   time="2022-06-01T10:14:41Z" level=info msg="iptables-save -t nat"
time="2022-06-01T10:14:41Z" level=info msg="modprobe: can't change directory to '/lib/modules': No such file or directory\niptables-save v1.8.7 (legacy): Cannot initialize: iptables who? (do you need to insmod?)\n\n"
time="2022-06-01T10:14:41Z" level=error msg="aborting firewall configuration"
Error: exit status 1
Usage:

How can it be reproduced?

Install Kubernetes v1.24.1 with Project Calico on CentOS Stream 8, then install Linkerd.

Logs, error output, etc

  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  2m42s                 default-scheduler  Successfully assigned linkerd/linkerd-destination-796cf7c454-slcf5 to k8s-wn1
  Normal   Pulled     77s (x5 over 2m41s)   kubelet            Container image "cr.l5d.io/linkerd/proxy-init:v1.5.3" already present on machine
  Normal   Created    77s (x5 over 2m41s)   kubelet            Created container linkerd-init
  Normal   Started    77s (x5 over 2m41s)   kubelet            Started container linkerd-init
  Warning  BackOff    51s (x10 over 2m39s)  kubelet            Back-off restarting failed container
Linkerd core checks
===================

linkerd-existence
-----------------
× control plane pods are ready
    No running pods for "linkerd-destination"
    see https://linkerd.io/2.11/checks/#l5d-api-control-ready for hints

Status check results are ×
      Reason:    Error
      Message:   time="2022-06-01T10:14:41Z" level=info msg="iptables-save -t nat"
time="2022-06-01T10:14:41Z" level=info msg="modprobe: can't change directory to '/lib/modules': No such file or directory\niptables-save v1.8.7 (legacy): Cannot initialize: iptables who? (do you need to insmod?)\n\n"
time="2022-06-01T10:14:41Z" level=error msg="aborting firewall configuration"
Error: exit status 1

output of linkerd check -o short

linkerd check -o short
Linkerd core checks
===================

linkerd-existence
-----------------
\ No running pods for "linkerd-destination"

Environment

  • Kubernetes v1.24.1
  • Manual K8S installation
  • CentOS Stream 8
  • Client version: stable-2.11.2
  • Server version: stable-2.11.2

Possible solution

No response

Additional context

No response

Would you like to work on fixing this bug?

No response

choclo avatar Jun 01 '22 10:06 choclo

I just saw this is a dupe of issue #7749

choclo avatar Jun 01 '22 10:06 choclo

@choclo are you running with SELinux turned on for your hosts? I think I might've figured out how to get our iptables init container to work with SELinux.

First, understanding the problem: afaik, when running DNAT through netfilter, certain kernel modules need to be loaded (e.g. the nat_redirect and/or ipt_redirect modules). On SELinux hosts, it seems that the proxy-init container can't load them. A workaround seems to be allowing the container to run in privileged mode. When running with privileged=true, the container has the same capabilities and permissions as any other process running on the host, including the ability to load kernel modules.

If you're running with PSPs, you might also need to have something similar to this:

  seLinux:
    rule: RunAsAny

Currently, proxy-init does not support setting privileged=true directly. It's a bit difficult for me to get a host up and running on an OS that has SELinux, so I can't easily reproduce this. If you can confirm that everything works by following the instructions above, I'd be super open to making this configurable so we can fix this.

Steps you'd need to take at the moment to enable privileged mode:

  • either inject a resource using linkerd inject --manual > random.yaml and then manually modify the initContainer spec to run as privileged;
  • or set Values.proxyInit.closeWaitTimeoutSecs when installing linkerd -- closeWaitTimeoutSecs requires the initContainer to run as privileged so that'd be set for you out of the box.

It's possible setting privileged=true alone won't work. In that case, I'd recommend also running the initContainer as root. I'm fairly confident it should work when it runs as root and as privileged.
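The two options above could be sketched roughly as follows. This is an untested sketch, not a confirmed fix: the function names and the `app.yaml` input file are hypothetical, the value `5` is an arbitrary example, and it assumes the CLI's `--set` flag accepts the `proxyInit.closeWaitTimeoutSecs` value. The `securityContext` fields are standard Kubernetes container fields and would have to be edited into the `linkerd-init` container by hand.

```shell
#!/bin/sh
# Sketch only: names are hypothetical and nothing here has been verified
# on an SELinux host.

# Option 1: render the injected manifest so the linkerd-init container
# can be edited by hand before it is applied.
render_injected() {
  # $1: workload manifest (hypothetical, e.g. app.yaml), $2: output file
  linkerd inject --manual "$1" > "$2"
}

# Fields to set by hand on the linkerd-init container: privileged mode,
# plus root in case privileged alone is not enough.
init_security_context() {
  cat <<'EOF'
securityContext:
  privileged: true
  runAsNonRoot: false
  runAsUser: 0
EOF
}

# Option 2: set closeWaitTimeoutSecs at install time, which is described
# above as switching the init container to privileged mode.
install_with_close_wait() {
  linkerd install --set proxyInit.closeWaitTimeoutSecs=5 | kubectl apply -f -
}
```

For example, `render_injected app.yaml random.yaml` writes the manifest to edit, and `init_security_context` prints the fragment to merge into the `linkerd-init` entry before running `kubectl apply -f random.yaml`.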

If you can confirm these things for me, we might be able to close this out without a lot of effort :) Alternatively, have a look at the CNI plugin; it should be able to set up routing rules without an initContainer.

mateiidavid avatar Jun 01 '22 11:06 mateiidavid

I have the same issue with Kubernetes 1.24.1.

I tried setting closeWaitTimeoutSecs: 5 in the linkerd-config ConfigMap and privileged: true in the linkerd-destination Deployment, but I still see the error below.

    proxyInit:
      capabilities: null
      closeWaitTimeoutSecs: 5
      ignoreInboundPorts: 4567,4568
      ignoreOutboundPorts: 4567,4568
      image:
        name: cr.l5d.io/linkerd/proxy-init
        pullPolicy: ""
        version: v1.5.3
      Message:   time="2022-06-18T04:30:06Z" level=info msg="iptables-save -t nat"
time="2022-06-18T04:30:06Z" level=info msg="modprobe: can't change directory to '/lib/modules': No such file or directory\niptables-save v1.8.7 (legacy): Cannot initialize: iptables who? (do you need to insmod?)\n\n"
time="2022-06-18T04:30:06Z" level=error msg="aborting firewall configuration"
Error: exit status 1
Usage:
  proxy-init [flags]

Flags:
  -h, --help                               help for proxy-init
      --inbound-ports-to-ignore strings    Inbound ports and/or port ranges (inclusive) to ignore and not redirect to proxy. This has higher precedence than any other parameters.
  -p, --incoming-proxy-port int            Port to redirect incoming traffic (default -1)
      --log-format string                  Configure log format ('plain' or 'json') (default "plain")
      --log-level string                   Configure log level (default "info")
      --netns string                       Optional network namespace in which to run the iptables commands
      --outbound-ports-to-ignore strings   Outbound ports and/or port ranges (inclusive) to ignore and not redirect to proxy. This has higher precedence than any other parameters.
  -o, --outgoing-proxy-port int            Port to redirect outgoing traffic (default -1)
  -r, --ports-to-redirect ints             Port to redirect to proxy, if no port is specified then ALL ports are redirected
  -u, --proxy-uid int                      User ID that the proxy is running under. Any traffic coming from this user will be ignored to avoid infinite redirection loops. (default -1)
      --simulate                           Don't execute any command, just print what would be executed
      --subnets-to-ignore strings          Subnets to ignore and not redirect to proxy. This has higher precedence than any other parameters.
      --timeout-close-wait-secs int        Sets nf_conntrack_tcp_timeout_close_wait
  -w, --use-wait-flag                      Appends the "-w" flag to the iptables commands

The same message appears in all 3 pods: linkerd-destination, linkerd-identity, and linkerd-proxy-injector.

NAME                                      READY   STATUS                  RESTARTS         AGE
linkerd-destination-5576cf4d4d-zx4v5      0/4     Init:CrashLoopBackOff   42 (3m39s ago)   10h
linkerd-identity-685dc4fd66-zmlz4         0/2     Init:CrashLoopBackOff   42 (3m26s ago)   10h
linkerd-proxy-injector-7f88c45487-lmqxg   0/2     Init:CrashLoopBackOff   42 (3m25s ago)   10h
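To pull the init container logs from every affected pod in one go, something like the following might help. It is a sketch: `init_crashloop_pods` is a hypothetical helper, and the `linkerd` namespace and `linkerd-init` container name are taken from the events in the original report.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# names of pods whose STATUS column is Init:CrashLoopBackOff.
init_crashloop_pods() {
  awk '$3 == "Init:CrashLoopBackOff" { print $1 }'
}

# Possible use against a live cluster:
#   kubectl -n linkerd get pods --no-headers | init_crashloop_pods |
#     while read -r pod; do
#       kubectl -n linkerd logs "$pod" -c linkerd-init
#     done
```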

I'm using the Calico CNI:

[root@lp-k8control-1 ~]# cat /etc/cni/net.d/10-calico.conflist 
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "log_file_path": "/var/log/calico/cni/cni.log",
      "datastore_type": "kubernetes",
      "nodename": "lp-k8control-1.home",
      "mtu": 0,
      "ipam": {
          "type": "calico-ipam"
      },
      "policy": {
          "type": "k8s"
      },
      "kubernetes": {
          "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {"portMappings": true}
    },
    {
      "type": "bandwidth",
      "capabilities": {"bandwidth": true}
    }
  ]
}

Environment

  • Kubernetes v1.24.1
  • Manual K8S installation
  • Rocky Linux 8.5
  • Kernel version: 4.18.0-348.el8.0.2.x86_64
  • iptables version: iptables v1.8.4 (nf_tables)
  • SELinux: permissive
  • Linkerd client version: stable-2.11.2
  • Linkerd server version: stable-2.11.2

gaganyaan2 avatar Jun 18 '22 04:06 gaganyaan2

The current workaround is to install the linkerd-cni plugin:

linkerd install-cni | kubectl apply -f -
linkerd upgrade --linkerd-cni-enabled | kubectl apply -f -

After installing linkerd-cni, all pods came into the Running state.

https://linkerd.io/2.11/features/cni/

gaganyaan2 avatar Jun 18 '22 18:06 gaganyaan2

@koolwithk ahh I see. Rocky Linux is derived from RHEL, right? In that case, the issue here is not related to module permissions (as I initially thought) but to the fact that RHEL-based distributions do not actually have support for iptables-legacy. I was curious whether OP had SELinux on because that's a completely different issue. We're considering switching to nftables, which should fix the issues with the init container on RHEL environments.
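A quick way to spot this mismatch is to compare the backend named in the host's iptables banner with the one proxy-init reports in its logs. The sketch below uses a hypothetical helper, `iptables_backend`; the banner strings it matches are the ones quoted in this thread, and older iptables releases that print no backend fall through to `unknown`.

```shell
#!/bin/sh
# Hypothetical helper: extract the backend name from an iptables version
# banner such as "iptables v1.8.4 (nf_tables)".
iptables_backend() {
  case "$1" in
    *"(nf_tables)"*) echo "nf_tables" ;;
    *"(legacy)"*)    echo "legacy" ;;
    *)               echo "unknown" ;;
  esac
}

# On a node, check the host's own banner:
#   iptables_backend "$(iptables --version)"
```

In this thread the host reports `iptables v1.8.4 (nf_tables)` while the init container logs `iptables-save v1.8.7 (legacy)`, so the helper would return `nf_tables` for the former and `legacy` for the latter.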

mateiidavid avatar Jun 30 '22 15:06 mateiidavid

This is now documented.

kleimkuhler avatar Sep 01 '22 16:09 kleimkuhler