NFSv4.1 and NFSv4.2 mount fails on Talos v1.9.x and v1.10.x (works on v1.8.x)

Open • c3y1huang opened this issue 1 year ago • 1 comment

Bug Report

Description

NFSv4 mounts (both v4.1 and v4.2) are failing with an Input/output error on Talos v1.9.x and v1.10.x. The same NFS export configuration (served via nfs-ganesha) works correctly when mounted from a Talos v1.8.x node.

To confirm the issue is specific to NFSv4.1 and NFSv4.2, we tested NFSv3 and NFSv4.0 mounts on the newer Talos versions, which completed successfully.
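For anyone trying to reproduce the version comparison above, here is a minimal sketch of a loop that tries each protocol version against the same export. The server, export path, and mount point default to the values from the mount log in this report. The `DRY_RUN` switch and the `try_version` helper are assumptions added for illustration and are not part of the original test procedure.

```shell
#!/bin/sh
# Sketch: try each NFS protocol version against one export and report the result.
# Defaults are taken from the mount command shown in this report.
SERVER="${SERVER:-longhorn-test-nfs-svc.default}"
EXPORT="${EXPORT:-/opt/backupstore}"
MOUNTPOINT="${MOUNTPOINT:-/mnt/nfs}"
# Hypothetical switch: DRY_RUN=1 (default) only prints the commands.
# Set DRY_RUN=0 to really mount; that needs root and an NFS client.
DRY_RUN="${DRY_RUN:-1}"

try_version() {
    vers="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "mount -t nfs -o nfsvers=$vers $SERVER:$EXPORT $MOUNTPOINT"
        return 0
    fi
    if mount -t nfs -o "nfsvers=$vers" "$SERVER:$EXPORT" "$MOUNTPOINT"; then
        echo "nfsvers=$vers: OK"
        umount "$MOUNTPOINT"
    else
        echo "nfsvers=$vers: FAILED"
    fi
}

for v in 3 4.0 4.1 4.2; do
    try_version "$v"
done
```

Going by the description above, on an affected node one would expect `3` and `4.0` to report OK while `4.1` and `4.2` fail.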

Logs

Mount Error:

> mount -vvv -t nfs -o nfsvers=4.2 longhorn-test-nfs-svc.default:/opt/backupstore /mnt/nfs
mount.nfs: timeout set for Fri May  9 01:46:31 2025
mount.nfs: trying text-based options 'nfsvers=4.2,addr=10.244.2.15,clientaddr=10.244.1.6'
mount.nfs: mount(2): Input/output error
mount.nfs: mount system call failed for /mnt/nfs

Relevant nfs-ganesha Log:

09/05/2025 00:40:37 : epoch 681d4a35 : longhorn-test-nfs-84bff47977-khct5 : ganesha.nfsd-185[svc_16] complete_op :NFS4 :DEBUG :Status of OP_GETATTR in position 2 = NFS4ERR_BADXDR, op response size is 4 total response size is 92
09/05/2025 00:40:37 : epoch 681d4a35 : longhorn-test-nfs-84bff47977-khct5 : ganesha.nfsd-185[svc_16] complete_nfs4_compound :NFS4 :DEBUG :End status = NFS4ERR_BADXDR lastindex = 3
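
To pick the failing operation out of a longer ganesha debug log, a small filter along these lines may help. It is a sketch that assumes the `complete_op` line format shown above stays the same. The `failed_ops` function name is hypothetical.

```shell
# Sketch: print every compound operation that ended with an NFS4ERR_* status.
# Reads ganesha DEBUG log lines on stdin; assumes the "Status of OP_x in
# position N = NFS4ERR_y" wording seen in the log excerpt above.
failed_ops() {
    sed -n 's/.*Status of \(OP_[A-Z0-9_]*\) in position \([0-9]*\) = \(NFS4ERR_[A-Z0-9_]*\).*/\1 position=\2 status=\3/p'
}
```

Usage could look like `kubectl logs longhorn-test-nfs-84bff47977-khct5 | failed_ops`. For the first log line above it prints `OP_GETATTR position=2 status=NFS4ERR_BADXDR`.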

Talos Kernel Versions:

  • v1.8.4:
    > talosctl --talosconfig talosconfig -n 52.221.241.146 read /proc/version
    Linux version 6.6.64-talos (@buildkitsandbox) (gcc (GCC) 13.3.0, GNU ld (GNU Binutils) 2.43.1) #1 SMP Wed Dec 11 15:33:07 UTC 2024
    
  • v1.9.5:
    > talosctl --talosconfig talosconfig -n 18.140.63.63 read /proc/version
    Linux version 6.12.18-talos (@buildkitsandbox) (gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.43.1) #1 SMP Tue Mar 11 15:15:12 UTC 2025
    
  • v1.10.1:
    > talosctl --talosconfig talosconfig -n 13.213.71.88 read /proc/version
    Linux version 6.12.25-talos (root@buildkitsandbox) (gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.44) #1 SMP Fri May  2 16:10:01 UTC 2025
    

Environment

  • Talos version: [talosctl version --nodes <problematic nodes>] v1.9.5
  • Kubernetes version: [kubectl version] v1.32.3
  • Platform: AWS

c3y1huang, May 09 '25 04:05

We're hitting this as well, looks like this (and this). In our case the backend is vSAN. Nothing Talos can do about it.

robinelfrink, Jun 05 '25 12:06