False "No space left on device" error
What happened? Getting the error "No space left on device" when trying to sync large directories (over 700 MB) from the local machine to a Kubernetes pod.
What did you expect to happen instead? The directory should sync :)
How can we reproduce the bug? (as minimally and precisely as possible) Build an image with a working directory over 700 MB and deploy it to Google Cloud.
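For illustration, a hypothetical repro helper (the directory name and file sizes are made up, not from the original report) could look like this:

```
# Pad a working directory past 700 MB, spread across many subdirectories
# to roughly mimic a large project tree.
for d in $(seq 1 700); do
  mkdir -p "big-dir/sub_$d"
  dd if=/dev/urandom of="big-dir/sub_$d/filler.bin" bs=1M count=1 status=none
done
```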
Local Environment:
- DevSpace Version: 5.14.0 to 5.14.3 (tried multiple versions)
- Operating System: Linux
- Deployment method: helm
Kubernetes Cluster:
- Cloud Provider: google
- Kubernetes Version:
  - Client: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
  - Server: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.6-gke.1000", GitCommit:"3ae0998c5052f420a17cb96bacf860ec5d6822a3", GitTreeState:"clean", BuildDate:"2021-04-29T09:17:16Z", GoVersion:"go1.15.10b5", Compiler:"gc", Platform:"linux/amd64"}
Anything else we need to know?
- no node or other cloud limitations related to disk space were found
- `df -h` on the pod shows a lot of free disk space, as well as free inodes (`df -i`); see the sketch after this list for running these checks in the pod
- removing some files to reduce the directory size under 700 MB solves the problem
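A sketch of those checks run from outside the pod, using the pod and namespace names that appear later in this issue:

```
# Confirm free disk space and free inodes inside the pod itself.
kubectl exec -n my-namespace mypod-5f9556c8b8-rwdwg -- df -h
kubectl exec -n my-namespace mypod-5f9556c8b8-rwdwg -- df -i
```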
/kind bug
@oleksandr-vynnyk thanks for creating this issue! Could you provide the complete log output of DevSpace when this issue occurs? Does the node the pod is running on have disk pressure?
@FabianKramm thanks for the quick feedback. Here is the complete DevSpace output:
devspace sync --pod mypod-5f9556c8b8-rwdwg -n my-namespace --debug
[warn] If you want to use the sync paths from `devspace.yaml`, use the `--config=devspace.yaml` flag for this command.
[info] Using namespace 'my-namespace'
[info] Using kube context 'my-context'
[info] Waiting for pods...
[info] Starting sync...
[info] Inject devspacehelper into pod my-namespace/mypod-5f9556c8b8-rwdwg
[info] Start syncing
[done] √ Sync started on /home/me/dev/my/proj <-> . (Pod: my-namespace/mypod-5f9556c8b8-rwdwg)
[error] Sync Error on /home/me/dev/my/proj: Sync - connection lost to pod my-namespace/mypod-5f9556c8b8-rwdwg: 2021/07/09 10:58:07 error watching path /opt/service: error while traversing /opt/service/public/assets/js/plugin/flot: no space left on device
command terminated with exit code 1
[info] Sync stopped
This behavior is permanent and does not depend on disk load etc. The error goes away only after removing some files from the /opt/service directory.
@oleksandr-vynnyk thanks for the information! Does this also happen if you exclude some of the folders via excludePaths? Are you running on an ARM architecture?
@FabianKramm no, it does not depend on exclusions. But when I remove some files/directories on the pod and resync from local, syncing stops once the remote size reaches 700 MB.
@oleksandr-vynnyk I see, I think the problem here is actually not the space on the device but rather the number of inotify watches that is allowed. You can check this by using polling instead of inotify for syncing:
dev:
  sync:
    - ...
      polling: true
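To pick this up, the sync would be re-run against the config file; this is a sketch based on the command and the `--config` warning earlier in this thread, with the pod and namespace names from this report:

```
# Re-run the sync pointing at devspace.yaml so the configured sync paths
# (including the polling option) are used instead of the defaults.
devspace sync --config=devspace.yaml --pod mypod-5f9556c8b8-rwdwg -n my-namespace --debug
```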
@FabianKramm Why are there inotify watches set up for files in directories explicitly excluded via excludePaths? Is there some other way of keeping paths from ever getting watchers?
@artushin the problem was that the underlying notify library we were using did not support this. We just released DevSpace v5.14.4, where this should be fixed: it should no longer set any inotify watches for excluded folders in the container.
@FabianKramm Awesome, thanks, will test later today!
@FabianKramm unfortunately, it does not work. I got the same behavior after adding this option.
Hello, we have the same issue on Linux Ubuntu 20.04:
[error] Sync Error on /home/florian/workspace/japi: error while traversing /home/florian/xxxxxxx : no space left on device
[info] Sync stopped
[fatal] initial sync: error while traversing /home/florian/workspace/xxxxxxxxx: no space left on device
Prior to version v5.15.0, the "no space left on device" error was already there, but it did not block the sync since the sync restarted. Now it blocks all syncing after the error.
We got this error with a smaller project (20 MB), and we have checked the inodes and disk space; everything is ok.
I have increased the fs.inotify.max_user_watches parameter in my sysctl configuration on my Linux workstation, and it works now. @oleksandr-vynnyk, have you tried this?
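For anyone who wants to try this quickly, a minimal sketch (assuming a Linux host; 524288 is just a commonly used generous value) that checks and raises the limit for the running kernel without persisting it:

```
# Show the current limit, then raise it at runtime only
# (lost on reboot; see the persistent variant later in this thread).
sysctl fs.inotify.max_user_watches
sudo sysctl fs.inotify.max_user_watches=524288
```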
Facing this issue with 5.18.4; we didn't have this issue in versions prior to 5.18.
> I have increased the fs.inotify.max_user_watches parameter in my sysctl configuration on my Linux workstation, and it works now. @oleksandr-vynnyk, have you tried this?

Thank you very much. It works.
FYI, to expand on what @christos-P mentioned:
We hit this recently as well and were able to work around it thus:
Check the local machine's limit for watches:
`cat /proc/sys/fs/inotify/max_user_watches`
Raise the local machine's limit to something absurd:
sudo sh -c "echo fs.inotify.max_user_watches=524288 >> /etc/sysctl.conf"
sudo sysctl -p
The error is very confusing because:
- It actually has nothing to do with disk space at all; the C errno just got reused (inotify_add_watch fails with ENOSPC when the watch limit is hit, and that errno's standard message is "No space left on device").
- It is about the local Linux machine (i.e. your laptop) and not about anything inside the Kube cluster you're trying to connect to.
- It doesn't happen for folks on Mac laptops, as a different file monitoring method is used there (FSEvents rather than inotify).
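To see roughly who is consuming the watches on the local machine, here is a sketch assuming a Linux host with /proc mounted (run as root to see all processes; each watch appears as one `inotify wd:` line in the owning file descriptor's fdinfo):

```
# Sum inotify watches across all processes: every inotify file descriptor
# is a symlink to "anon_inode:inotify", and each watch on it is one
# "inotify wd:" line in the matching fdinfo file.
for fd in $(find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null); do
  rest=${fd#/proc/}            # e.g. "1234/fd/20"
  pid=${rest%%/*}
  n=${fd##*/}
  grep -c '^inotify' "/proc/$pid/fdinfo/$n" 2>/dev/null
done | awk '{ sum += $1 } END { print sum, "inotify watches in use" }'
```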