Reachability of the NFS backend should be taken into account during pod scheduling
Description of current situation
A pod with an NFS PVC attached gets deployed on a node even though the node can't reach the NetApp backend. Instead of the expected scheduling or mount error, the pod starts successfully on this node, and an empty, read-only host directory gets mounted at the actual mount path.
Environment
- Trident version: 22.01
- Container runtime: CRI-O
- Kubernetes version: 1.22
- Kubernetes orchestrator: any OCP 4.x
To Reproduce
Steps to reproduce the behavior (a manifest sketch follows the list):
- Set up a cluster (OCP 4.9.x) with a single worker node (worker-1).
- Install Trident and create a simple NFS storage class.
- Create a PVC using the NFS storage class.
- Set up a firewall rule that blocks connections from worker-1 to the NetApp (NFS backend).
- Start a pod with this newly created PVC on worker-1.
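For reference, a minimal sketch of the manifests used in the steps above. The object names are hypothetical, and the provisioner/backend values assume a standard Trident CSI install with an ONTAP NAS backend:

```yaml
# Hypothetical names throughout; csi.trident.netapp.io is the standard Trident CSI provisioner.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-sc
provisioner: csi.trident.netapp.io
parameters:
  backendType: ontap-nas        # assumes an ONTAP NAS (NFS) backend
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs-sc
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: nfs-test
spec:
  nodeName: worker-1            # pin to the node that cannot reach the NFS backend
  containers:
  - name: app
    image: busybox
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      mountPath: /mnt/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: nfs-pvc
```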
Result
The pod starts and mounts a read-only host directory at the mount path where the NFS share should have been mounted.
Expected behavior
- The pod stays in a Pending or failed state because the NFS storage backend cannot be reached from worker-1.
- The pod does not mount a host directory in place of the NFS share.
Additional context
I don't face this issue with iSCSI, because there the CHAP authentication fails before the mount, which prevents the pod from being placed on worker-1. I assume the same issue would occur with iSCSI without CHAP.
Prior to the fix for https://github.com/NetApp/trident/issues/572, I wasn't even able to shut down a pod in this scenario. That makes perfect sense: a read-only host directory can't be released/unmounted.
Hi @phhutter
Trident does not control how pods are scheduled. You should be able to use node selectors to influence where a pod can be deployed; see the sketch below. Ensuring network connectivity to the storage controller from every node that needs access to volumes is a prerequisite.
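As a sketch of that approach, the pod below uses a hypothetical label (the label key and pod name are assumptions, not anything Trident defines) to keep the workload on nodes known to have NFS connectivity:

```yaml
# Assumes nodes with working NFS connectivity carry a hypothetical label, e.g.:
#   kubectl label node worker-2 example.com/nfs-reachable="true"
apiVersion: v1
kind: Pod
metadata:
  name: nfs-consumer
spec:
  nodeSelector:
    example.com/nfs-reachable: "true"   # scheduler only considers labeled nodes
  containers:
  - name: app
    image: busybox
    command: ["sleep", "infinity"]
```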
If you would like to establish boundaries, CSI Topology is worth looking into. It allows you to create regions and zones, as well as restrict access to backends. In addition, the WaitForFirstConsumer volume binding mode helps by creating volumes only when they are needed, i.e., when a consumer is first scheduled; a combined sketch follows.
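A minimal sketch combining both ideas, assuming nodes and Trident backends have been labeled with the standard Kubernetes topology keys (the storage class name and zone value are hypothetical):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-topology-sc
provisioner: csi.trident.netapp.io
parameters:
  backendType: ontap-nas
volumeBindingMode: WaitForFirstConsumer   # provision only once a pod is scheduled
allowedTopologies:                        # restrict volumes to zones that can reach the backend
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - zone-a                              # hypothetical zone with backend connectivity
```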
Closing this issue, as the Kubernetes scheduler decides where pods are placed. Taints and tolerations can also be applied to Kubernetes nodes to control where pods are scheduled; see the sketch below.
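For completeness, a sketch of the taint-based approach (the taint key and pod name are hypothetical): the node without backend connectivity is tainted, and only pods carrying a matching toleration can land there.

```yaml
# Hypothetical taint applied to the node that cannot reach the NFS backend:
#   kubectl taint node worker-1 example.com/no-nfs=true:NoSchedule
# Pods without a matching toleration will no longer be scheduled onto worker-1.
apiVersion: v1
kind: Pod
metadata:
  name: non-nfs-workload
spec:
  tolerations:                 # this pod is allowed (not forced) to run on the tainted node
  - key: example.com/no-nfs
    operator: Equal
    value: "true"
    effect: NoSchedule
  containers:
  - name: app
    image: busybox
    command: ["sleep", "infinity"]
```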