
Cache pushed from one machine can not be reused on another machine

Open kindermax opened this issue 6 years ago • 16 comments

Hi. Currently, I am trying to set up an infrastructure for reusing caches for local (dev machine) builds.

Here are the steps:

  1. Build an image using docker buildx build in CI (GitLab CI) - Docker Engine version 19.03
  2. Push the image from CI to the GitLab Registry:
docker buildx build . \
      -t registry.my-company-gitlab.com/app:latest \
      -f ./docker/Dockerfile.$IMAGE \
      --cache-from=type=registry,ref=registry.my-company-gitlab.com/app:latest \
      --cache-to=type=registry,ref=registry.my-company-gitlab.com/app:latest,mode=max \
      --push 

The Dockerfiles use multi-stage builds.

  3. Then I run a build on my laptop, expecting it to reuse the cache from the registry image built and pushed in steps 1 and 2:
docker buildx build -t my-local-image -f Dockerfile.app --cache-from=type=registry,ref=registry.my-company-gitlab.com/app:latest --load .

Before any builds, I run

docker builder prune
docker system prune -a
  4. But the new build does not reuse any of the cache and starts building from scratch.

kindermax avatar Nov 11 '19 11:11 kindermax

Please post a reproducible testcase that we could run to figure this out.

One thing I noticed is that you are using the same reference for the cache and your image, so unless this is just a mistake in the report, this is definitely wrong. Also, until recently the GitHub registry didn't support the manifest lists that are used by the external cache format and multi-platform images, so I'm surprised you made it that far.

tonistiigi avatar Nov 11 '19 17:11 tonistiigi

Hi, thank you for the quick response.

I've set up a test project to reproduce the cache issue.

https://gitlab.com/kindritskiy.m/docker-cache-issue

It is not a multi-stage build. It is GitLab CI (not GitHub).

you are using the same reference for cache and your image

I am not sure I understand what you mean. Do you mean these two lines

--cache-from=type=registry,ref=registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest \
      --cache-to=type=registry,ref=registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest,mode=max \

or this

-t registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest
  1. The job that builds and pushes the image - https://gitlab.com/kindritskiy.m/docker-cache-issue/-/jobs/348012533
  2. Then I try to build the image locally with that cache:
docker buildx build -t my-local-image -f Dockerfile --cache-from=type=registry,ref=registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest --load .
[+] Building 28.2s (11/11) FINISHED                                                                     
 => importing cache manifest from registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest      3.4s
 => [internal] load .dockerignore                                                                  0.1s
 => => transferring context: 2B                                                                    0.0s
 => [internal] load build definition from Dockerfile                                               0.1s
 => => transferring dockerfile: 127B                                                               0.0s
 => [internal] load metadata for docker.io/library/node:12-alpine                                  1.7s
 => [internal] load build context                                                                  0.1s
 => => transferring context: 462B                                                                  0.0s
 => [1/5] FROM docker.io/library/node:12-alpine@sha256:50ce309a948aaad30ee876fb07ccf35b62833b27de  0.0s
 => CACHED [2/5] WORKDIR /app                                                                     20.8s
 => => pulling sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10             2.7s
 => => pulling sha256:7b373bfb6ac5ffc0602bd1033666f9138fc137d68f67c4d4726cd6ac0c6bc9ac            18.9s
 => => pulling sha256:fd38342e03373b2ef6a3c7f354adf8d165454628b1de8c8329e70b3ef6325710             1.3s
 => => pulling sha256:5269cc77d334b68485968973d8e40df41f5d712a0cc66580bf9d925e5da6b923             1.4s
 => => pulling sha256:eafd5e4882da62f3d333317fc4fa6322755c84c6cb978dc3b962180455760a98             0.6s
 => [3/5] COPY package.json .                                                                      0.9s
 => [4/5] COPY requirements.txt .                                                                  0.1s
 => [5/5] RUN npm i                                                                                2.8s
 => exporting to image                                                                             0.2s
 => => exporting layers                                                                            0.1s
 => => writing image sha256:32fbc089de4f0b350a0362b5866345f019032d2b69e8e29092f9ce1639f02c34       0.0s 

On CI I use the https://github.com/docker/buildx/releases/download/v0.3.1/buildx-v0.3.1.linux-amd64 binary. On my laptop I have a bundled version of buildx - the same version as on CI:

docker buildx version
github.com/docker/buildx v0.3.1 6db68d029599c6710a32aa7adcba8e5a344795a7 

kindermax avatar Nov 12 '19 09:11 kindermax

Do you mean these two lines

You can't use the same ref for --cache-* and -t because they are different objects and are pushed separately. (The exception here would be inline cache, which does not push a separate object but appends metadata to the image config.)
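To make the distinction concrete, here is a minimal sketch of the corrected setup. The repository registry.example.com/app and the buildcache tag are hypothetical names, not taken from the thread; the docker invocations are shown as comments since they need a daemon and a registry to run:

```shell
# The image tag and the cache ref must point at DIFFERENT objects
# (both names below are assumptions for illustration):
IMAGE="registry.example.com/app:latest"
CACHE="registry.example.com/app:buildcache"   # dedicated tag for the exported cache

# CI build: push the image, and export full-granularity cache to a separate ref:
#   docker buildx build . -t "$IMAGE" \
#     --cache-from "type=registry,ref=$CACHE" \
#     --cache-to   "type=registry,ref=$CACHE,mode=max" \
#     --push
#
# Inline-cache exception: metadata is embedded in the image config itself,
# so no separate object is pushed (but only mode=min granularity):
#   docker buildx build . -t "$IMAGE" --cache-to type=inline --push
echo "image=$IMAGE cache=$CACHE"
```

Local builds would then pass the same cache ref via --cache-from while tagging the image with any local name.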

tonistiigi avatar Nov 12 '19 17:11 tonistiigi

I can reproduce your case, but if I push my own image with docker buildx build --cache-to type=registry,ref=tonistiigi/build-cache-issue:latest,mode=max . then running docker buildx build --cache-from tonistiigi/build-cache-issue:latest . seems to work fine for the whole build.

tonistiigi avatar Nov 12 '19 22:11 tonistiigi

Thank you, now I know that the tag and the cache can't be the same. I didn't find any docs about that, so it's good to know it from you. I will try to fix my builds. But one question still bothers me: if I build and push an image from my laptop, then CI can reuse all the cache. It really works. But what about building on one machine (let's say a CI server) and using that cache on another machine? Will it work?

kindermax avatar Nov 12 '19 22:11 kindermax

@kindritskyiMax Yes, it should work if you switch machines. Did you check whether my cache works for you? So are you saying that --cache-to works for you (even when switching machines) but does not work if you export from a specific machine?

tonistiigi avatar Nov 12 '19 22:11 tonistiigi

If the image was built and pushed from my machine and used only on my machine, then the cache works. But when using different machines, the cache does not work.

kindermax avatar Nov 12 '19 22:11 kindermax

I will try your image when I am back at my laptop. I will let you know. Thank you.

kindermax avatar Nov 12 '19 22:11 kindermax

@kindritskyiMax And that is even if you remove your local cache to make sure the remote cache is used? In https://gitlab.com/kindritskiy.m/docker-cache-issue/-/jobs/348012533 I also see that the remote cache was used, so is it that cache exported in CI only works when importing on CI machines (with fresh state)?

Or does it have something to do with exporting a cache that has already been imported, as happens in https://gitlab.com/kindritskiy.m/docker-cache-issue/-/jobs/348012533 ?

tonistiigi avatar Nov 12 '19 22:11 tonistiigi

And that is even if you remove your local cache to make sure remote cache is used?

Yes, I am removing everything to be sure I use only remote cache.

I have noticed as well that CI servers can reuse a cache built on CI servers, but my laptop cannot reuse that cache.

kindermax avatar Nov 12 '19 22:11 kindermax

@tonistiigi I've tried building an image from your cache and it didn't work.

docker buildx build . -t my-from-cache -f Dockerfile --cache-from tonistiigi/build-cache-issue:latest
[+] Building 10.9s (11/11) FINISHED                                                                     
 => importing cache manifest from tonistiigi/build-cache-issue:latest                              2.7s
 => [internal] load .dockerignore                                                                  0.1s
 => => transferring context: 2B                                                                    0.0s
 => [internal] load build definition from Dockerfile                                               0.1s
 => => transferring dockerfile: 127B                                                               0.0s
 => [internal] load metadata for docker.io/library/node:12-alpine                                  2.0s
 => [1/5] FROM docker.io/library/node:12-alpine@sha256:50ce309a948aaad30ee876fb07ccf35b62833b27de  0.0s
 => [internal] load build context                                                                  0.1s
 => => transferring context: 462B                                                                  0.0s
 => CACHED [2/5] WORKDIR /app                                                                      4.8s
 => => pulling sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10             0.9s
 => => pulling sha256:7b373bfb6ac5ffc0602bd1033666f9138fc137d68f67c4d4726cd6ac0c6bc9ac             3.4s
 => => pulling sha256:fd38342e03373b2ef6a3c7f354adf8d165454628b1de8c8329e70b3ef6325710             0.8s
 => => pulling sha256:5269cc77d334b68485968973d8e40df41f5d712a0cc66580bf9d925e5da6b923             0.2s
 => => pulling sha256:841db4349b44a385cef1a07272aa772b5c7dd4ec84638734f9b9190a82d07125             0.3s
 => [3/5] COPY package.json .                                                                      0.8s
 => [4/5] COPY requirements.txt .                                                                  0.1s
 => [5/5] RUN npm i                                                                                2.2s
 => exporting to image                                                                             0.1s
 => => exporting layers                                                                            0.1s
 => => writing image sha256:3f4e7189d3ac2931b0d9b72bd8b802c1833cfdd33fc8e6c8d57c458faeea3c0d       0.0s 

kindermax avatar Nov 13 '19 07:11 kindermax

I've updated the build command on CI: https://gitlab.com/kindritskiy.m/docker-cache-issue/blob/master/build_cache.sh#L31

Here is the new job - https://gitlab.evo.dev/m.kindritskiy/docker-cache/-/jobs/3160699. Still, I cannot reuse the cache on my laptop when building with the following command (the buildx binary is committed to the repo):

./docker-buildx build -f Dockerfile --cache-from registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest --load .
#####
[+] Building 11.8s (11/11) FINISHED                                                                                                                                                                                                  
 => importing cache manifest from registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest                                                                                                                                   3.5s
 => [internal] load .dockerignore                                                                                                                                                                                               0.1s
 => => transferring context: 2B                                                                                                                                                                                                 0.0s
 => [internal] load build definition from Dockerfile                                                                                                                                                                            0.1s
 => => transferring dockerfile: 127B                                                                                                                                                                                            0.0s
 => [internal] load metadata for docker.io/library/node:12-alpine                                                                                                                                                               1.3s
 => [1/5] FROM docker.io/library/node:12-alpine@sha256:50ce309a948aaad30ee876fb07ccf35b62833b27de4d3a818295982efb04ce6b                                                                                                         0.0s
 => [internal] load build context                                                                                                                                                                                               0.1s
 => => transferring context: 462B                                                                                                                                                                                               0.0s
 => CACHED [2/5] WORKDIR /app                                                                                                                                                                                                   4.9s
 => => pulling sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10                                                                                                                                          1.1s
 => => pulling sha256:7b373bfb6ac5ffc0602bd1033666f9138fc137d68f67c4d4726cd6ac0c6bc9ac                                                                                                                                          3.0s
 => => pulling sha256:fd38342e03373b2ef6a3c7f354adf8d165454628b1de8c8329e70b3ef6325710                                                                                                                                          1.4s
 => => pulling sha256:5269cc77d334b68485968973d8e40df41f5d712a0cc66580bf9d925e5da6b923                                                                                                                                          0.5s
 => => pulling sha256:1cb335f2ff4b7a9042d20028e41736ea2ccd5e5961d308cd1f780ff65b2c1956                                                                                                                                          0.8s
 => [3/5] COPY package.json .                                                                                                                                                                                                   1.2s
 => [4/5] COPY requirements.txt .                                                                                                                                                                                               0.1s
 => [5/5] RUN npm i                                                                                                                                                                                                             1.8s
 => exporting to image                                                                                                                                                                                                          0.1s
 => => exporting layers                                                                                                                                                                                                         0.1s
 => => writing image sha256:2fd26692991a805e65ace943a3f03ac029f6111e4dc62329cb6c3a5b5006f729  

kindermax avatar Nov 13 '19 09:11 kindermax

@tonistiigi Hi. Do you have any ideas or suggestions on how to fix this cache reuse between machines?

kindermax avatar Nov 15 '19 14:11 kindermax

I'm having the exact same problem with a CI/CD pipeline on GitLab when mixing shared runners and private runners. The cache is only reused if a private runner uses --cache-from pointing to an image made by a private runner as well. The same applies to shared runners: the cache is only reused if a shared runner uses --cache-from pointing to an image made by a shared runner.

@kindritskyiMax did you manage to get around this?

satazor avatar Jan 10 '20 12:01 satazor

I didn't solve exactly this problem with the built-in cache reuse, but I found another solution. It's not perfect, but it at least reduces the amount of building on a dev machine.

Workaround.

I've created a Python script that calculates the checksum of the files the Dockerfile depends on. For example, for a Python project, those are requirements.txt and requirements-dev.txt.

There is a file called checksum-deps.txt that contains two lines: requirements.txt and requirements-dev.txt.

Using make, each time I run some command like make run, it calculates the checksum, and if I do not have a local image tagged with that checksum, I try to pull myimage:<checksum>.

Also, I have a periodic job in GitLab (roughly every 30 minutes) that calculates the checksums and pushes images, so I almost always have an image for the new checksum if any of the dependency files has changed.

There is quite a lot of machinery, but checksum misses are almost zero.
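A minimal sketch of this workaround in shell (the author's actual script is in Python and is not shown in the thread; the file names come from the comment above, while the myimage repository name and 12-character tag length are assumptions):

```shell
# Demo setup: the dependency files. In a real project these already exist,
# so only the CHECKSUM line below would appear in the Makefile/script.
printf 'flask==2.0\n' > requirements.txt
printf 'pytest\n' > requirements-dev.txt

# Combined checksum over all dependency files, shortened to a 12-char tag:
CHECKSUM=$(cat requirements.txt requirements-dev.txt | sha256sum | cut -c1-12)
echo "tag: $CHECKSUM"

# Then, e.g. from a make target (requires docker, so shown as comments):
#   docker image inspect "myimage:$CHECKSUM" >/dev/null 2>&1 \
#     || docker pull "myimage:$CHECKSUM" \
#     || docker build -t "myimage:$CHECKSUM" .
```

The periodic CI job would run the same checksum computation and push myimage:<checksum>, so the local pull usually succeeds whenever the dependency files are unchanged.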

kindermax avatar Jan 11 '20 10:01 kindermax

Any news on this issue?

I'm facing the same problem, and it is quite an issue for larger containers. We are trying to run a local build-cache registry that should be used by CI and multiple devs (e.g. for devcontainers).

cbauernhofer avatar May 09 '25 11:05 cbauernhofer