
False "No space left on device" error

Open oleksandr-vynnyk opened this issue 4 years ago • 14 comments

What happened? Getting a "No space left on device" error when trying to sync large directories (over 700 MB) from the local machine to a Kubernetes pod.

What did you expect to happen instead? Should sync :)

How can we reproduce the bug? (as minimally and precisely as possible) Build an image whose working directory is over 700 MB and deploy it to Google Cloud.

Local Environment:

  • DevSpace Version: 5.14.0 - 5.14.3 (tried to change)
  • Operating System: linux
  • Deployment method: helm

Kubernetes Cluster:

  • Cloud Provider: google
  • Kubernetes Version: Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.6-gke.1000", GitCommit:"3ae0998c5052f420a17cb96bacf860ec5d6822a3", GitTreeState:"clean", BuildDate:"2021-04-29T09:17:16Z", GoVersion:"go1.15.10b5", Compiler:"gc", Platform:"linux/amd64"}

Anything else we need to know?

  • No node or other cloud limitations related to disk space were found.
  • df -h in the pod shows plenty of free disk space, and df -i shows plenty of free inodes.
  • Removing some files to bring the directory size under 700 MB solves the problem.

/kind bug

oleksandr-vynnyk avatar Jul 09 '21 09:07 oleksandr-vynnyk

@oleksandr-vynnyk thanks for creating this issue! Could you provide the complete log output of DevSpace when this issue occurs? Does the node where the pod is running on have disk pressure?

FabianKramm avatar Jul 09 '21 10:07 FabianKramm

@FabianKramm thanks for quick feedback. Here is complete devspace output:

devspace sync --pod mypod-5f9556c8b8-rwdwg -n my-namespace --debug
[warn]   If you want to use the sync paths from `devspace.yaml`, use the `--config=devspace.yaml` flag for this command.
[info]   Using namespace 'my-namespace'
[info]   Using kube context 'my-context'
[info]   Waiting for pods...
[info]   Starting sync...
[info]   Inject devspacehelper into pod my-namespace/mypod-5f9556c8b8-rwdwg
[info]   Start syncing
[done] √ Sync started on /home/me/dev/my/proj <-> . (Pod: my-namespace/mypod-5f9556c8b8-rwdwg)
[error]  Sync Error on /home/me/dev/my/proj: Sync - connection lost to pod my-namespace/mypod-5f9556c8b8-rwdwg: 2021/07/09 10:58:07 error watching path /opt/service: error while traversing /opt/service/public/assets/js/plugin/flot: no space left on device
 command terminated with exit code 1
[info]   Sync stopped

This behavior is consistent; it doesn't depend on disk load, etc. The error goes away only after removing some files from the /opt/service directory.

oleksandr-vynnyk avatar Jul 09 '21 11:07 oleksandr-vynnyk

@oleksandr-vynnyk thanks for the information! Does this also happen if you exclude some of the folders via excludePaths? Are you running on arm architecture?

FabianKramm avatar Jul 12 '21 07:07 FabianKramm

@FabianKramm no, it does not depend on exclusions. But when I remove some files/directories in the pod and resync from local, it stops syncing when the remote size reaches 700 MB.

oleksandr-vynnyk avatar Jul 12 '21 09:07 oleksandr-vynnyk

@oleksandr-vynnyk I see. I think the problem here is actually not the space on the device but rather the number of inotify watches that is allowed. You can check this by using polling instead of inotify for syncing:

dev:
  sync:
  - ...
    polling: true
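
One way to sanity-check the watch-limit theory before switching to polling is to compare the number of directories under the synced path with the kernel's per-user watch limit. The following is a minimal sketch, assuming the sync sets roughly one inotify watch per directory (inotify itself is not recursive, so watching a tree needs one watch per directory). `count_dirs` and `report_watch_headroom` are hypothetical helper names, not part of DevSpace:

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of DevSpace.

# Count directories under a path, including the path itself.
count_dirs() {
  find "$1" -type d 2>/dev/null | wc -l | tr -d ' '
}

# Compare the number of watches needed against a limit.
# Usage: report_watch_headroom NEEDED LIMIT
report_watch_headroom() {
  needed=$1
  limit=$2
  if [ "$needed" -ge "$limit" ]; then
    echo "over limit: $needed directories, $limit watches allowed"
  else
    echo "within limit: $needed directories, $limit watches allowed"
  fi
}
```

It could be run where the watcher runs (in this thread's log, the container path /opt/service), for example `report_watch_headroom "$(count_dirs /opt/service)" "$(cat /proc/sys/fs/inotify/max_user_watches)"`. A "within limit" result is not conclusive, because other processes of the same user also consume watches.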

FabianKramm avatar Jul 12 '21 09:07 FabianKramm

@FabianKramm Why are there io watches set up for files in the directories explicitly removed by excludePaths? Is there some other way of excluding paths from ever getting watchers?

artushin avatar Jul 19 '21 21:07 artushin

@artushin the problem was that the underlying notify library we were using did not support this. We just released DevSpace v5.14.4, where this should be fixed: it should no longer set any inotify watches for excluded folders in the container.

FabianKramm avatar Jul 20 '21 09:07 FabianKramm

@FabianKramm Awesome, thanks, will test later today!

artushin avatar Jul 20 '21 15:07 artushin

@FabianKramm unfortunately, it does not work. Got the same behavior after adding this option

oleksandr-vynnyk avatar Jul 30 '21 15:07 oleksandr-vynnyk

Hello, we have the same issue on Linux (Ubuntu 20.04).

[error] Sync Error on /home/florian/workspace/japi: error while traversing /home/florian/xxxxxxx : no space left on device
[info] Sync stopped
[fatal] initial sync: error while traversing /home/florian/workspace/xxxxxxxxx: no space left on device

Prior to version v5.15.0, the "no space left on device" error was already there, but it did not block the sync because the sync restarted. Now the sync stays blocked after the error.

We got this error with a smaller project (20 MB), and we have checked the inodes and disk space; everything is OK.

c1rc0le avatar Sep 09 '21 08:09 c1rc0le

I have increased this parameter fs.inotify.max_user_watches of my sysctl configuration on my Linux workstation and it works now. @oleksandr-vynnyk , have you tried this?

c1rc0le avatar Sep 09 '21 15:09 c1rc0le

Facing this issue with 5.18.4. We didn't have this issue in versions prior to 5.18.

hariprasadiit avatar Mar 22 '22 08:03 hariprasadiit

> I have increased this parameter fs.inotify.max_user_watches of my sysctl configuration on my Linux workstation and it works now. @oleksandr-vynnyk , have you tried this?

Thank you very much. It works.

christos-P avatar Sep 05 '23 12:09 christos-P

FYI, to expand on what @christos-P mentioned:

We hit this recently as well and were able to work around it thus:

Check the local machine's limit for watches:

cat /proc/sys/fs/inotify/max_user_watches

Raise the local machine's limit to something absurd:

sudo sh -c "echo fs.inotify.max_user_watches=524288 >> /etc/sysctl.conf"
sudo sysctl -p
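
Note that the `>>` append above adds another line each time it is run. As a hedged alternative, the sketch below rewrites the setting in place so the file ends up with exactly one entry for the key. `set_sysctl_line` is a hypothetical helper name, and the file path is a parameter so it can be tried on a scratch file first:

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a sysctl-style config file,
# replacing any existing line for KEY instead of appending a duplicate.
# Usage: set_sysctl_line FILE KEY VALUE
set_sysctl_line() {
  file=$1
  key=$2
  value=$3
  touch "$file"
  # Copy every line whose name (text before "=") is not KEY ...
  awk -v k="$key" -F= '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' \
    "$file" > "$file.tmp"
  # ... then add the new assignment and swap the file into place.
  echo "$key=$value" >> "$file.tmp"
  mv "$file.tmp" "$file"
}
```

For example, `set_sysctl_line ./99-inotify.conf fs.inotify.max_user_watches 524288` produces a file that could then be copied to /etc/sysctl.d/ (on distributions that read that directory) and loaded with `sudo sysctl --system`. To raise the limit only until the next reboot, `sudo sysctl -w fs.inotify.max_user_watches=524288` applies it immediately without touching any file.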

The error is very confusing because:

  • It actually has nothing to do with disk space at all; the C errno (ENOSPC) just got reused for hitting the inotify watch limit.
  • It is about the local Linux machine (i.e. your laptop) and not about anything inside the Kube cluster you're trying to connect to.
  • It doesn't happen for folks on Mac laptops, as there's a different file monitoring method used there.
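
Because the same error code covers both a full disk and exhausted watches, one way to tell them apart is to count how many inotify watches are currently registered and compare that with the limit. A rough sketch, relying on the `inotify wd:` lines that Linux exposes in `/proc/<pid>/fdinfo/<fd>`; `count_inotify_watches` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: count inotify watches currently registered by
# counting "inotify wd:..." lines in PROC_ROOT/<pid>/fdinfo/*.
# Only processes whose fdinfo is readable are counted (run as root to see all).
# Usage: count_inotify_watches [PROC_ROOT]
count_inotify_watches() {
  proc_root=${1:-/proc}
  find "$proc_root"/[0-9]*/fdinfo -type f -exec grep -h '^inotify' {} + 2>/dev/null \
    | wc -l | tr -d ' '
}
```

If `count_inotify_watches` returns a number close to the value in /proc/sys/fs/inotify/max_user_watches while `df -h` and `df -i` still show free space, the watch limit is the more likely cause.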

dragonpaw avatar Sep 05 '23 12:09 dragonpaw