Derek Su
Assume the failover time should be less than 30 seconds. There are several factors we need to consider to achieve NFS HA. - The default Kubernetes `node-monitor-grace-period` is 40...
> > * Default Kubernetes `node-monitor-grace-period` is 40 seconds. It means the node status is set to `down`/`not ready` when the node is down for 40 seconds. Thus, the value should...
**Update** - `node-monitor-grace-period` of Kubernetes: 40 seconds (default value) - manually pre-pulled the share-manager image on each node (we should implement a share-manager image controller for the pre-pull) -...
Following the discussion in today's PR review, I made the following changes in longhorn-manager and longhorn-share-manager for the PoC. - Replace NFS client option `soft, timeo=30, retrans=3` with `hard` -...
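The option swap described above can be sketched as a small string transformation. This is only an illustration, not the actual longhorn-manager change: the helper name `to_hard_mount` and the extra `vers=4.1` option in the usage line are hypothetical.

```python
def to_hard_mount(options: str) -> str:
    """Return a comma-separated NFS option string with `soft`, `timeo=...`
    and `retrans=...` removed and `hard` appended.

    Hypothetical helper for illustration; not taken from longhorn-manager.
    """
    dropped_prefixes = ("timeo=", "retrans=")
    kept = []
    for opt in (o.strip() for o in options.split(",")):
        # Skip empty entries, any existing soft/hard flag, and the
        # timeout/retry settings that the comment says are replaced.
        if not opt or opt in ("soft", "hard") or opt.startswith(dropped_prefixes):
            continue
        kept.append(opt)
    kept.append("hard")
    return ",".join(kept)


# `vers=4.1` is an assumed extra option, used only to show that
# unrelated options are preserved.
print(to_hard_mount("vers=4.1,soft,timeo=30,retrans=3"))
```

With `hard`, the client keeps retrying requests indefinitely instead of returning an error to the application after the retries are exhausted, so in-flight I/O waits for the server to come back rather than failing.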
An NFS recovery backend records the client ID when a client connects to the NFS server, and the record is used by the recovery process after the server crashes and...
@innobead It needs to be compiled into the nfs-ganesha executable at compile time. The nfs-server is running in the share-manager pod, so I'm still thinking about how to access the in-cluster...
- The hostname of the share-manager pod is already the pod name and won't change after failing over to another node. - CIFS does not support symbolic link operations, so...
Update on the work in https://github.com/longhorn/longhorn/issues/2293#issuecomment-1193494986: the `fs_ng` recovery backend stores the client information in the format `::ffff:${client IPv4}-(31:Linux NFSv4.1 ${node name})`. My environment is k3s with flannel CNI. - When the...
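To make the record format above concrete, here is a minimal parsing sketch. It assumes the record is a single string exactly in the quoted shape; the names `ClientRecord` and `parse_client_record` and the sample address and node name are hypothetical. The meaning of the number before the colon (`31` in the example) is not stated here, so the sketch just keeps it as an integer.

```python
import re
from typing import NamedTuple, Optional


class ClientRecord(NamedTuple):
    ipv4: str       # client IPv4 address taken from the `::ffff:` mapped form
    number: int     # the integer before the colon; its meaning is not assumed
    node_name: str  # text following "Linux NFSv4.1 "


# Pattern for `::ffff:${client IPv4}-(N:Linux NFSv4.1 ${node name})`.
_RECORD_RE = re.compile(
    r"^::ffff:(\d{1,3}(?:\.\d{1,3}){3})-\((\d+):Linux NFSv4\.1 (.+)\)$"
)


def parse_client_record(record: str) -> Optional[ClientRecord]:
    """Split one recovery record into its fields, or return None if the
    string does not match the expected shape."""
    match = _RECORD_RE.match(record)
    if match is None:
        return None
    return ClientRecord(match.group(1), int(match.group(2)), match.group(3))


# Sample values are made up for illustration.
print(parse_client_record("::ffff:10.42.0.1-(31:Linux NFSv4.1 example-node)"))
```

Because the client IPv4 address is part of the record, a client that reconnects from a different source address would produce a different string, which is why the CNI behavior matters in this investigation.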
> longhorn/longhorn-share-manager@9e021ac No problem. My current nfs-ganesha is based on https://github.com/longhorn/nfs-ganesha/commit/1c27c2b8667b849c0b1b1d8fb718a736ddec393f. I will update the nfs-ganesha version as well.
> @derekbit great job on the investigation and testing :) > > I wonder though if the connection was via `cni0` before the failure, what happens to the connection after...