Build failing with "failed to prepare sha256" error
Contributing guidelines
- [x] I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- [x] ... the documentation does not mention anything about my problem
- [x] ... there are no open or closed issues that are related to my problem
Description
ERROR: failed to solve: failed to prepare sha256:6ef31b9aac55f699e551706c154f1b66955d5e4379da9e6ffc45d5163cde3777 as xyx2mcarp3p5pksqoa7y6rv90: open /var/lib/docker/overlay2/2a1d88ce0a17b395ae852bb3c22bc4821b23f90f01c2caacc0503cc783a73fc9/.tmp-committed4135229564: no such file or directory
Getting this error repeatedly when trying to build an Alpine-based image on an Ubuntu-based Jenkins server.
Expected behaviour
The image builds without any errors.
Actual behaviour
The build fails with the above-mentioned error.
Buildx version
github.com/docker/buildx v0.19.3 48d6a39
Docker info
ubuntu@ip-172-31-75-146:~$ docker info
Client: Docker Engine - Community
Version: 27.4.1
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.19.3
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.32.1
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 5
Running: 5
Paused: 0
Stopped: 0
Images: 6
Server Version: 27.4.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 88bf19b2105c8b17560993bee28a01ddc2f97182
runc version: v1.2.2-0-g7cb3632
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.8.0-1021-aws
Operating System: Ubuntu 22.04.5 LTS
OSType: linux
Architecture: aarch64
CPUs: 16
Total Memory: 30.75GiB
Name: ip-172-31-75-146
ID: 1b28ad27-2811-410d-8e4b-db85893d2f73
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Builders list
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
multiarch* docker-container
\_ multiarch0 \_ unix:///var/run/docker.sock inactive
multiplatformbuilder docker-container
\_ multiplatformbuilder0 \_ unix:///var/run/docker.sock inactive
default docker
\_ default \_ default running v0.17.3 linux/amd64, linux/arm64, linux/arm (+2), linux/ppc64le, (4 more)
Configuration
FROM jar-docker-images:alpine-node-arm64
# Add a cache buster argument
ARG CACHEBUSTER=1
# Install build and runtime dependencies
RUN apk add \
bash \
g++ \
make \
python3 \
git \
chromium \
nss \
freetype \
harfbuzz \
ca-certificates \
ttf-freefont \
lz4 \
cyrus-sasl \
openssl \
nodejs \
npm \
typescript \
&& echo "Cache buster: $CACHEBUSTER"
# Clear NPM cache to avoid potential issues
RUN npm cache clean --force
# Install Puppeteer with bundled Chromium for ARM
RUN npm install puppeteer --unsafe-perm=true --loglevel=verbose || { \
echo "Retrying Puppeteer install..."; \
npm cache clean --force && npm install puppeteer --unsafe-perm=true; \
}
# Set Puppeteer Chromium executable path
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=false
ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser
# Set working directory
WORKDIR /usr/src/app
# Copy application files
COPY . .
# Install application dependencies
RUN npm install
# Compile TypeScript project
RUN tsc -p tsconfig.staging.json
# Expose application port
EXPOSE 5001
# Start the application in staging mode
CMD ["npm", "run", "start:staging"]
Build logs
creating docker build and pushing to the ECR
deployment info nodejs : mantis : staging
docker build for nodejs
#0 building with "default" instance using docker driver
#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 1.32kB done
#1 DONE 0.0s
#2 [auth] sharing credentials for
#2 DONE 0.0s
#3 [internal] load metadata for /jar-docker-images:alpine-node-arm64
#3 DONE 0.1s
#4 [internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s
#5 [1/8] FROM /jar-docker-images:alpine-node-arm64@sha256:28bbfaf93681a35bfae4fe54098986f41e77998a080fa347c1e2b7f79a6a8c70
#5 CACHED
#6 [2/8] RUN apk add bash g++ make python3 git chromium nss freetype harfbuzz ca-certificates ttf-freefont lz4 cyrus-sasl openssl nodejs npm typescript && echo "Cache buster: 1"
#6 ERROR: failed to prepare sha256:6ef31b9aac55f699e551706c154f1b66955d5e4379da9e6ffc45d5163cde3777 as xyx2mcarp3p5pksqoa7y6rv90: open /var/lib/docker/overlay2/2a1d88ce0a17b395ae852bb3c22bc4821b23f90f01c2caacc0503cc783a73fc9/.tmp-committed4135229564: no such file or directory
#7 [internal] load build context
#7 transferring context: done
#7 CANCELED
------
> [2/8] RUN apk add bash g++ make python3 git chromium nss freetype harfbuzz ca-certificates ttf-freefont lz4 cyrus-sasl openssl nodejs npm typescript && echo "Cache buster: 1":
------
Dockerfile:7
--------------------
6 | # Install build and runtime dependencies
7 | >>> RUN apk add \
8 | >>> bash \
9 | >>> g++ \
10 | >>> make \
11 | >>> python3 \
12 | >>> git \
13 | >>> chromium \
14 | >>> nss \
15 | >>> freetype \
16 | >>> harfbuzz \
17 | >>> ca-certificates \
18 | >>> ttf-freefont \
19 | >>> lz4 \
20 | >>> cyrus-sasl \
21 | >>> openssl \
22 | >>> nodejs \
23 | >>> npm \
24 | >>> typescript \
25 | >>> && echo "Cache buster: $CACHEBUSTER"
26 |
--------------------
ERROR: failed to solve: failed to prepare sha256:6ef31b9aac55f699e551706c154f1b66955d5e4379da9e6ffc45d5163cde3777 as xyx2mcarp3p5pksqoa7y6rv90: open /var/lib/docker/overlay2/2a1d88ce0a17b395ae852bb3c22bc4821b23f90f01c2caacc0503cc783a73fc9/.tmp-committed4135229564: no such file or directory
[Pipeline] }
[Pipeline] // stage
[Pipeline] stage
[Pipeline] { (Declarative: Post Actions)
[Pipeline] cleanWs
[WS-CLEANUP] Deleting project workspace...
[WS-CLEANUP] Deferred wipeout is used...
[WS-CLEANUP] done
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
ERROR: script returned exit code 1
Finished: FAILURE
Additional info
No response
Can we get some help here? Facing a similar issue on my project.
Facing the same issue on my project as well. This is happening repeatedly and very often. It would be great to have this resolved.
I did a quick search in the buildkit, docker, and containerd codebases, and can't find a direct reference to anything creating a .tmp-committed<number> file or directory.
However, I did find a reference to creating a .tmp-<suffix> file as part of NewAtomicFileWriter; code: https://github.com/moby/moby/blob/a72026acbbdfcf90f0ba203abd4e0943e3d546e7/pkg/ioutils/fswriters.go#L13
func NewAtomicFileWriter(filename string, perm os.FileMode) (io.WriteCloser, error) {
	f, err := os.CreateTemp(filepath.Dir(filename), ".tmp-"+filepath.Base(filename))
	if err != nil {
		return nil, err
	}
But it's not directly clear if the error originates from buildkit (or docker) code, or if it's an error produced by the code running inside the container (in this case apk add), which may be creating a temp file (and possibly there's some race condition, or it doesn't play well in combination with overlayfs).
The example provided unfortunately appears to depend on a custom base image (jar-docker-images:alpine-node-arm64);
- if possible, can you provide a minimal-as-possible reproducer based on a public image?
- does this happen with the default (builtin) builder as well, or only with a custom (container-driver) builder? (a sketch for comparing the two is below)
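A rough way to compare the two, assuming one of the container-driver builders from the builders list above (multiarch is taken from that output; any docker-container builder works, and the image tag is just a placeholder):
# build with a docker-container driver builder
docker buildx build --builder multiarch --load -t repro-test .
# build with the default docker-driver builder for comparison
docker buildx build --builder default -t repro-test .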
Hi,
Can replicate this with the node:22-alpine base image as well.
It's possible that apk add is causing this; will dig more into that.
There's no way to conclusively debug and solve this unless the Docker error is more precise.
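As a starting point for the minimal reproducer asked for above, something along these lines (hypothetical and untested; the package list is trimmed to a few apk packages) should exercise the same apk add path on a public image:
# Dockerfile.repro - hypothetical minimal reproducer on a public base image
FROM node:22-alpine
RUN apk add --no-cache bash git chromium
# build it the same way the failing pipeline does
docker build -f Dockerfile.repro -t repro .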
I'm currently experiencing the same on one of our Docker hosts.
It seems to have nothing to do with apk or any particular tool, as our Dockerfile looks like this:
ARG REGISTRY=our-registry.com
FROM ${REGISTRY}/base-image:tag
COPY dependencies.txt /opt/app/etc/dependencies.txt
ADD build/results/ /opt/app/
RUN yum makecache && \
yum makecache && yum install -y $(cat /opt/app/etc/dependencies.txt) && \
yum clean all && rm -rf /var/yum/cache
...
And the build currently fails on step 3/4 (ADD)
=> ERROR [3/4] ADD build/results/ /opt/app/ 0.0s
------
> [3/4] ADD build/results/ /opt/app/:
------
Dockerfile:5
--------------------
3 |
4 | COPY dependencies.txt /opt/app/etc/dependencies.txt
5 | >>> ADD build/results/ /opt/app/
6 |
7 | RUN yum makecache && \
--------------------
ERROR: failed to solve: failed to prepare sha256:d555d0b4787b48abff53b195b1e633605bfccd4375307f83dfaca71726296503 as 0fnvnlk45jk2voxro4ryvfq57: open /var/lib/docker/overlay2/uvd5vsvtv8o4ryeycd4o22i2q/.tmp-committed660597121: no such file or directory
Stracing the Docker daemon shows the file being opened with O_CREAT, so the ENOENT can only come from the parent directory not existing:
openat(AT_FDCWD</>, "/var/lib/docker/overlay2/uvd5vsvtv8o4ryeycd4o22i2q/.tmp-committed3194320328", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0600) = -1 ENOENT (No such file or directory)
I also could not find any reference to the directory /var/lib/docker/overlay2/uvd5vsvtv8o4ryeycd4o22i2q being created prior to the open().
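(For anyone wanting to reproduce the trace: attaching to the running daemon and filtering for openat, roughly as below, should be enough; the exact invocation used is not shown above.)
# attach to the running dockerd and watch for the failing openat calls
strace -f -e trace=openat -p $(pidof dockerd) 2>&1 | grep 'tmp-committed'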
In addition, I noticed this write(), where something like a stack trace is written to a file by another thread:
write(49</var/lib/docker/buildkit/content/ingest/d8c3a3cf71a170d58f5b4a8d87984e302e6bf0f5d8e5ae0bfd99f28874098c3c/data>, "\10\2\22\343\1failed to prepare sha256:d555d0b4787b48abff53b195b1e633605bfccd4375307f83dfaca71726296503 as 8p4l8oi8g3h7njtgtjlwf99m8: open /var/lib/docker/overlay2/uvd5vsvtv8o4ryeycd4o22i2q/.tmp-committed3194320328: no such file or directory\32\344\7\n)github.com/moby/buildkit/stack.Stack+json\22\266\7{\"frames\":[{\"Name\":\"github.com/moby/buildkit/cache.(*cacheManager).New\",\"File\":\"/root/rpmbuild/BUILD/src/engine/vendor/github.com/moby/buildkit/cache/manager.go\",\"Line\":628},{\"Name\":\"github.com/moby/buildkit/solver/llbsolver/file.(*RefManager).Prepare\",\"File\":\"/root/rpmbuild/BUILD/src/engine/vendor/github.com/moby/buildkit/solver/llbsolver/file/refmanager.go\",\"Line\":43},{\"Name\":\"github.com/moby/buildkit/solver/llbsolver/ops.(*FileOpSolver).getInput.func1.(*FileOpSolver).getInput.func1.2.6\",\"File\":\"/root/rpmbuild/BUILD/src/engine/vendor/github.com/moby/buildkit/solver/llbsolver/ops/file.go\",\"Line\":475},{\"Name\":\"golang.org/x/sync/errgroup.(*Group).Go.func1\",\"File\":\"/root/rpmbuild/BUILD/src/engine/vendor/golang.o"..., 7032) = 7032
So it seems like it has something to do with the cache.
When I take the same docker build command line, which fails repeatedly, and add the --no-cache argument, the build succeeds.
So it seems like there is a race condition between writing the cache and the creation of the filesystem layer?
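Concretely, the working workaround here is just the otherwise identical command with --no-cache added (image tag is a placeholder):
# same build, but bypassing the layer cache
docker build --no-cache -t my-app:staging .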
Facing the same issue. In our case, re-running the build magically fixes it. Haven't had much success with anything else so far.
We managed to fix the issue by creating and using a builder based on the docker-container driver rather than the default docker driver.
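Roughly, that amounts to the following (the builder name is arbitrary, and --load is needed to get the result back into the local image store):
# create a builder backed by the docker-container driver and make it the default
docker buildx create --name workaround-builder --driver docker-container --use
docker buildx build --load -t my-app:staging .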
Getting the same issue. I suspected it was due to the hosting/availability of a package dependency I was pulling down from the web; after upgrading versions I was able to build a new container, but four days later even that won't build. I have pruned everything Docker-related and am running the latest software versions.
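For reference, "pruning everything Docker" typically amounts to something like the following (a sketch; --all also removes unused images and all build cache):
docker builder prune --all --force
docker system prune --all --force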