buildctl-daemonless.sh - could not connect to buildkitd.sock
Hello! In our pipelines we sometimes (in 10-20% of all builds) get the following error:
could not connect to unix:///run/user/1000/buildkit/buildkitd.sock after 10 trials
What does it mean, and how can we fix it?
We use daemonless, rootless BuildKit.
We run the following command:
buildctl-daemonless.sh build
--frontend=dockerfile.v0
--local context=.
--local dockerfile=.
--opt filename=./Dockerfile
--import-cache type=registry,ref=$IMPORT_CACHE
--output type=image,oci-mediatypes=true,force-compression=true,\"name=$IMAGE\",push=true,unpack=false,store=false
--opt target=builder
Thank you!
Please provide the logs
https://github.com/moby/buildkit/blob/68059406655653f5df4b35f5a4728d673920f166/examples/buildctl-daemonless/buildctl-daemonless.sh#L49
I checked for logs in the /tmp folder, but they did not exist. I then removed ; rm -rf $tmp from buildctl-daemonless.sh and waited for the error to occur again. When it happened again (a few times), I found the following in the log:
time="2022-08-16T16:25:13Z" level=info msg="found worker \"ba7r3qaq508mnk1u8tu1kvb9j\", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:runner-zmnvr4gq-project-604-concurrent-0 org.mobyproject.buildkit.worker.network:host org.mobyproject.buildkit.worker.oci.process-mode:sandbox org.mobyproject.buildkit.worker.snapshotter:overlayfs], platforms=[linux/amd64 linux/amd64/v2 linux/amd64/v3 linux/386]"
time="2022-08-16T16:25:13Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
time="2022-08-16T16:25:13Z" level=info msg="found 1 workers, default=\"ba7r3qaq508mnk1u8tu1kvb9j\""
time="2022-08-16T16:25:13Z" level=warning msg="currently, only the default worker can be used."
time="2022-08-16T16:25:13Z" level=info msg="stopping server"
buildkitd: context canceled
[rootlesskit:child ] error: command [buildkitd --addr=unix:///run/user/1000/buildkit/buildkitd.sock] exited: exit status 1
[rootlesskit:parent] error: child exited: exit status 1
Here is the log for a successful build:
time="2022-08-16T16:16:43Z" level=info msg="auto snapshotter: using overlayfs"
time="2022-08-16T16:16:44Z" level=info msg="found worker \"ba7r3qaq508mnk1u8tu1kvb9j\", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:runner-zmnvr4gq-project-604-concurrent-0 org.mobyproject.buildkit.worker.network:host org.mobyproject.buildkit.worker.oci.process-mode:sandbox org.mobyproject.buildkit.worker.snapshotter:overlayfs], platforms=[linux/amd64 linux/amd64/v2 linux/amd64/v3 linux/386]"
time="2022-08-16T16:16:44Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
time="2022-08-16T16:16:44Z" level=info msg="found 1 workers, default=\"ba7r3qaq508mnk1u8tu1kvb9j\""
time="2022-08-16T16:16:44Z" level=warning msg="currently, only the default worker can be used."
time="2022-08-16T16:16:44Z" level=info msg="running server on /run/user/1000/buildkit/buildkitd.sock"
time="2022-08-16T16:19:04Z" level=info msg="stopping server"
buildkitd: context canceled
[rootlesskit:child ] error: command [buildkitd --addr=unix:///run/user/1000/buildkit/buildkitd.sock] exited: exit status 1
[rootlesskit:parent] error: child exited: exit status 1
The only difference is this line:
time="2022-08-16T16:16:44Z" level=info msg="running server on /run/user/1000/buildkit/buildkitd.sock"
This line does not appear in the logs of failed builds.
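To make that comparison repeatable across many saved logs, a small check along these lines could help. It is only a sketch: daemon_started is a hypothetical helper name, and the log path is whatever file the buildkitd output was kept in.

```shell
# Hypothetical helper: succeeds if the given buildkitd log contains the
# "running server on" message, i.e. the daemon got as far as listening
# on its socket.
daemon_started() {
    grep -q 'msg="running server on ' "$1"
}

# Example usage (path/to/log is a placeholder for a saved log file):
# if daemon_started path/to/log; then
#     echo "buildkitd started listening"
# else
#     echo "buildkitd never reported a listening socket"
# fi
```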
I also encountered a similar problem. Setting BUILDCTL_CONNECT_RETRIES_MAX for buildctl-daemonless.sh to 20 reduces the failure rate, but the problem still occurs.
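As a stopgap while the root cause is unknown, the whole daemonless invocation could be wrapped in an outer retry on top of the higher BUILDCTL_CONNECT_RETRIES_MAX. The following is a minimal sketch, not a fix: the retry function and the RETRY_DELAY variable are hypothetical names introduced here, and the attempt count of 3 is arbitrary.

```shell
# Hypothetical helper: run a command up to max_attempts times, returning 0
# on the first success and 1 if every attempt fails.
# RETRY_DELAY (seconds between attempts, default 1) is an assumed knob.
retry() {
    max_attempts=$1
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-1}"
    done
    return 0
}

# Example usage (build arguments elided, use the same ones as before):
# export BUILDCTL_CONNECT_RETRIES_MAX=20
# retry 3 buildctl-daemonless.sh build ...
```

Each retry starts a fresh buildkitd, so this only hides the intermittent startup failure; it does not explain why the "running server on" message is missing.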