Non-deterministic context "not found"

Open devinrsmith opened this issue 3 years ago • 1 comments

I came across an issue in GH CI:

error: failed to solve: failed to compute cache key: failed to calculate checksum of ref 6h5t7qkfv7q6htvvsqs6sw4wd::n2jajo70gwxthi6avdwzqnrrv: "/groovy": not found

The bake executes a bunch of targets, but the target that failed was pretty simple:

target "groovy-config" {
    context = "deephaven-app/"
    target = "groovy-config"
}

FROM scratch as groovy-config
COPY --link groovy/ /opt/deephaven/config/

This task has been running successfully for a dozen builds, and re-running the failed GH job a second time "fixed" the issue.

I'm not sure where exactly the issue lies - maybe there is a potential issue w/ the caching layer (type=gha)? I can attach the full GH CI logs if helpful.

Edit (some version info):

2022-07-21T15:10:05.6753675Z ##[group]Run docker/[email protected]
2022-07-21T15:10:05.6753908Z with:
2022-07-21T15:10:05.6754082Z   targets: release
2022-07-21T15:10:05.6754270Z   pull: true
2022-07-21T15:10:05.6754432Z   push: true
2022-07-21T15:10:05.6754611Z   workdir: .
2022-07-21T15:10:05.6754790Z   no-cache: false
2022-07-21T15:10:05.6754957Z   load: false
2022-07-21T15:10:05.6755133Z env:
2022-07-21T15:10:05.6755336Z   REPO_PREFIX: ghcr.io/devinrsmith/
2022-07-21T15:10:05.6755539Z ##[endgroup]
2022-07-21T15:10:05.8182879Z ##[group]Docker info
2022-07-21T15:10:05.8190747Z [command]/usr/bin/docker version
2022-07-21T15:10:05.8522316Z Client:
2022-07-21T15:10:05.8522939Z  Version:           20.10.17+azure-1
2022-07-21T15:10:05.8523274Z  API version:       1.41
2022-07-21T15:10:05.8523800Z  Go version:        go1.17.11
2022-07-21T15:10:05.8524360Z  Git commit:        100c70180fde3601def79a59cc3e996aa553c9b9
2022-07-21T15:10:05.8524825Z  Built:             Mon Jun  6 21:36:39 UTC 2022
2022-07-21T15:10:05.8525301Z  OS/Arch:           linux/amd64
2022-07-21T15:10:05.8525793Z  Context:           default
2022-07-21T15:10:05.8526299Z  Experimental:      true
2022-07-21T15:10:05.8526706Z 
2022-07-21T15:10:05.8526882Z Server:
2022-07-21T15:10:05.8527050Z  Engine:
2022-07-21T15:10:05.8527283Z   Version:          20.10.17+azure-1
2022-07-21T15:10:05.8527617Z   API version:      1.41 (minimum version 1.12)
2022-07-21T15:10:05.8528081Z   Go version:       go1.17.11
2022-07-21T15:10:05.8528620Z   Git commit:       a89b84221c8560e7a3dee2a653353429e7628424
2022-07-21T15:10:05.8529080Z   Built:            Mon Jun  6 22:32:38 2022
2022-07-21T15:10:05.8529557Z   OS/Arch:          linux/amd64
2022-07-21T15:10:05.8530052Z   Experimental:     false
2022-07-21T15:10:05.8530539Z  containerd:
2022-07-21T15:10:05.8531084Z   Version:          1.5.13+azure-1
2022-07-21T15:10:05.8531595Z   GitCommit:        a17ec496a95e55601607ca50828147e8ccaeebf1
2022-07-21T15:10:05.8532156Z  runc:
2022-07-21T15:10:05.8532670Z   Version:          1.0.3
2022-07-21T15:10:05.8533233Z   GitCommit:        f46b6ba2c9314cfc8caae24a32ec5fe9ef1059fe
2022-07-21T15:10:05.8533685Z  docker-init:
2022-07-21T15:10:05.8534171Z   Version:          0.19.0
2022-07-21T15:10:05.8534671Z   GitCommit:        
2022-07-21T15:10:05.8572483Z [command]/usr/bin/docker info
2022-07-21T15:10:05.9308659Z Client:
2022-07-21T15:10:05.9314455Z  Context:    default
2022-07-21T15:10:05.9317707Z  Debug Mode: false
2022-07-21T15:10:05.9318058Z  Plugins:
2022-07-21T15:10:05.9323259Z   buildx: Docker Buildx (Docker Inc., 0.8.2+azure-1)
2022-07-21T15:10:05.9326260Z   compose: Docker Compose (Docker Inc., 2.6.1+azure-1)
2022-07-21T15:10:05.9326542Z 
2022-07-21T15:10:05.9326736Z Server:
2022-07-21T15:10:05.9326908Z  Containers: 1
2022-07-21T15:10:05.9327092Z   Running: 1
2022-07-21T15:10:05.9327329Z   Paused: 0
2022-07-21T15:10:05.9335890Z   Stopped: 0
2022-07-21T15:10:05.9336072Z  Images: 15
2022-07-21T15:10:05.9336525Z  Server Version: 20.10.17+azure-1
2022-07-21T15:10:05.9336775Z  Storage Driver: overlay2
2022-07-21T15:10:05.9337065Z   Backing Filesystem: extfs
2022-07-21T15:10:05.9337677Z   Supports d_type: true
2022-07-21T15:10:05.9338083Z   Native Overlay Diff: false
2022-07-21T15:10:05.9338315Z   userxattr: false
2022-07-21T15:10:05.9338642Z  Logging Driver: json-file
2022-07-21T15:10:05.9338931Z  Cgroup Driver: cgroupfs
2022-07-21T15:10:05.9339153Z  Cgroup Version: 2
2022-07-21T15:10:05.9339348Z  Plugins:
2022-07-21T15:10:05.9339594Z   Volume: local
2022-07-21T15:10:05.9339841Z   Network: bridge host ipvlan macvlan null overlay
2022-07-21T15:10:05.9342070Z   Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
2022-07-21T15:10:05.9342468Z  Swarm: inactive
2022-07-21T15:10:05.9342789Z  Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
2022-07-21T15:10:05.9343149Z  Default Runtime: runc
2022-07-21T15:10:05.9343409Z  Init Binary: docker-init
2022-07-21T15:10:05.9343757Z  containerd version: a17ec496a95e55601607ca50828147e8ccaeebf1
2022-07-21T15:10:05.9344137Z  runc version: f46b6ba2c9314cfc8caae24a32ec5fe9ef1059fe
2022-07-21T15:10:05.9344676Z  init version: 
2022-07-21T15:10:05.9344918Z  Security Options:
2022-07-21T15:10:05.9345101Z   apparmor
2022-07-21T15:10:05.9345271Z   seccomp
2022-07-21T15:10:05.9345443Z    Profile: default
2022-07-21T15:10:05.9345630Z   cgroupns
2022-07-21T15:10:05.9345884Z  Kernel Version: 5.15.0-1014-azure
2022-07-21T15:10:05.9346110Z  Operating System: Ubuntu 22.04 LTS
2022-07-21T15:10:05.9346329Z  OSType: linux
2022-07-21T15:10:05.9346528Z  Architecture: x86_64
2022-07-21T15:10:05.9346797Z  CPUs: 2
2022-07-21T15:10:05.9347002Z  Total Memory: 6.781GiB
2022-07-21T15:10:05.9347312Z  Name: fv-az249-523
2022-07-21T15:10:05.9347644Z  ID: OIGB:ODNV:4BZ6:HM75:MCCQ:RAIS:5PDB:PSM7:W3QY:EVUA:7PJ6:SDEY
2022-07-21T15:10:05.9347989Z  Docker Root Dir: /var/lib/docker
2022-07-21T15:10:05.9348270Z  Debug Mode: false
2022-07-21T15:10:05.9348489Z  Username: githubactions
2022-07-21T15:10:05.9348806Z  Registry: https://index.docker.io/v1/
2022-07-21T15:10:05.9349105Z  Labels:
2022-07-21T15:10:05.9349319Z  Experimental: false
2022-07-21T15:10:05.9349519Z  Insecure Registries:
2022-07-21T15:10:05.9349704Z   127.0.0.0/8
2022-07-21T15:10:05.9349899Z  Live Restore Enabled: false
2022-07-21T15:10:05.9350114Z 
2022-07-21T15:10:05.9350703Z ##[endgroup]
2022-07-21T15:10:06.0943463Z ##[group]Buildx version
2022-07-21T15:10:06.0960805Z [command]/usr/bin/docker buildx version
2022-07-21T15:10:06.1624216Z github.com/docker/buildx 0.8.2+azure-1 6224def4dd2c3d347eee19db595348c50d7cb491
2022-07-21T15:10:06.1678217Z ##[endgroup]

devinrsmith avatar Jul 21 '22 16:07 devinrsmith

Hard to say what may be going on based on this info. Seems that a lot of things are happening together, and I don't really even understand where that groovy directory is supposed to come from.

Maybe you can put together a more limited reproducer that we could test with. If it requires multiple builds to trigger the case, then that is fine.

#13 [groovy-config internal] load build context
#13 transferring context: 210B done
#13 DONE 0.0s

Is that 210B enough to transfer the contents of that directory?

tonistiigi avatar Aug 02 '22 00:08 tonistiigi

Sorry for the delay - I think the 210B is enough; there was likely only a small configuration file in that directory.

It may be a case where the GH caching layer is "misbehaving" in some fashion. Maybe it says "yes, I have this cache", and then later when the build tries to get the cache it fails to actually fetch. Alternatively, maybe there is some bad state between cache and bake configuration if there are different cache scopes specified, but they all depend on a common intermediate.

I've come across this again in another bake workflow of mine, very similar in construction:

Run #1

#21 [2/2] COPY --link pack-plugins.sh .
#21 ERROR: failed to calculate checksum of ref s2916nc9ghwbrnjbhjp65csdt::xe7vknejdjgcqhj5x2r9pec9w: "/pack-plugins.sh": not found

Run #2

#39 [web-plugin-packager-release] exporting cache
#39 preparing build cache for export 0.1s done
#39 ERROR: not found

This second run was me re-invoking the job right after the first failure. It errored out a bit differently, so it potentially provides an additional piece of diagnostic info.

I suspect it's going to be hard to reproduce on demand. All the cases I've seen require at least a day between "success" and later "failure".

   /usr/bin/docker buildx version
  github.com/docker/buildx 0.9.1+azure-2 ed00243a0ce2a0aee75311b06e32d33b44729689

devinrsmith avatar Nov 07 '22 16:11 devinrsmith

Here's the sort of construction I've been using, simplified:

[Image: dependencies-simplified]

Maybe there is something fundamentally wrong w/ my approach to use separately named caches?
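
For illustration only, a bake file that combines separately named `type=gha` cache scopes with a shared intermediate might look like the sketch below. The target names, directory names, scope names, and the `base` context alias are all hypothetical and are not taken from the actual workflow. It assumes a Buildx version that supports `target:` references in `contexts`.

```hcl
# Hypothetical sketch: every name below is illustrative, not from the real project.
group "release" {
    targets = ["app-a", "app-b"]
}

# Common intermediate that both leaf targets depend on.
target "common" {
    context = "common/"
    cache-from = ["type=gha,scope=common"]
    cache-to = ["type=gha,scope=common,mode=max"]
}

target "app-a" {
    context = "app-a/"
    # Expose the result of "common" as a named context called "base",
    # which the Dockerfile can reference with FROM base or COPY --from=base.
    contexts = {
        base = "target:common"
    }
    # Each leaf target reads and writes its own cache scope.
    cache-from = ["type=gha,scope=app-a"]
    cache-to = ["type=gha,scope=app-a,mode=max"]
}

target "app-b" {
    context = "app-b/"
    contexts = {
        base = "target:common"
    }
    cache-from = ["type=gha,scope=app-b"]
    cache-to = ["type=gha,scope=app-b,mode=max"]
}
```

In a layout like this, `docker buildx bake release` builds `common` once and both leaf targets consume it, while each target imports from and exports to a differently scoped cache.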

devinrsmith avatar Nov 07 '22 16:11 devinrsmith