`--label` causes cache invalidation in later stages
**Actual behavior**
A change to a `--label` value causes cache invalidation for all stages after the first one.

**Expected behavior**
Caches are used as expected for a `RUN` command in all stages, provided there have been no changes to the command itself.
**To Reproduce**
Steps to reproduce the behavior:

1. Build the following Dockerfile with kaniko, setting the flags `--cache --target stage-2 --label somelabel=1`:

   ```Dockerfile
   FROM ubuntu:22.04 AS base
   RUN groupadd -g 1001 testie1

   FROM base AS stage-2
   RUN groupadd -g 1002 testie2
   ```

   ```shell
   docker run -v "/tmp/config.json:/kaniko/.docker/config.json" -v $PWD:/workspace2 gcr.io/kaniko-project/executor:debug --dockerfile /workspace2/docker/ubuntu-base-images/Dockerfile --no-push --context /workspace2/ --cache --cache-repo gitlab.catalyst.net.nz:4567/eugene/docker-images/kaniko-testing-cache --target stage-2 --label somelabel=6
   ```

2. Run step 1 again and observe the cache being used for both stages:
```
INFO[0000] Resolved base name ubuntu:22.04 to base
INFO[0000] Resolved base name base to stage-2
INFO[0000] Using dockerignore file: /workspace2/.dockerignore
INFO[0000] Retrieving image manifest ubuntu:22.04
INFO[0000] Retrieving image ubuntu:22.04 from registry index.docker.io
INFO[0002] Retrieving image manifest ubuntu:22.04
INFO[0002] Returning cached image manifest
INFO[0003] Built cross stage deps: map[]
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Executing 0 build triggers
INFO[0003] Building stage 'ubuntu:22.04' [idx: '0', base-idx: '-1']
INFO[0003] Checking for cached layer my.repo/kaniko-testing-cache:199088e7ceaea1f32dadea11e45b9a467f17625e3de9965e897b4056fd45a47f...
INFO[0003] Using caching version of cmd: RUN groupadd -g 1001 testie1
INFO[0003] Skipping unpacking as no commands require it.
INFO[0003] RUN groupadd -g 1001 testie1
INFO[0003] Found cached layer, extracting to filesystem
INFO[0003] Storing source image from stage 0 at path /kaniko/stages/0
INFO[0008] Deleting filesystem...
INFO[0008] Base image from previous stage 0 found, using saved tar at path /kaniko/stages/0
INFO[0008] Executing 0 build triggers
INFO[0008] Building stage 'base' [idx: '1', base-idx: '0']
INFO[0008] Checking for cached layer my.repo/kaniko-testing-cache:6e4d66b4f2793661198d4c386523cb3898018f5d68c1401938cd6834fcc26e1a...
INFO[0008] Using caching version of cmd: RUN groupadd -g 1002 testie2
INFO[0008] Skipping unpacking as no commands require it.
INFO[0008] RUN groupadd -g 1002 testie2
INFO[0008] Found cached layer, extracting to filesystem
INFO[0008] Skipping push to container registry due to --no-push flag
```
3. Change the label to `--label somelabel=2`, then run the build again. You'll note the cache not being used for the `RUN` in the second stage:
```
INFO[0000] Resolved base name ubuntu:22.04 to base
INFO[0000] Resolved base name base to stage-2
INFO[0000] Using dockerignore file: /workspace2/.dockerignore
INFO[0000] Retrieving image manifest ubuntu:22.04
INFO[0000] Retrieving image ubuntu:22.04 from registry index.docker.io
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Built cross stage deps: map[]
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Executing 0 build triggers
INFO[0003] Building stage 'ubuntu:22.04' [idx: '0', base-idx: '-1']
INFO[0003] Checking for cached layer my.repo/kaniko-testing-cache:199088e7ceaea1f32dadea11e45b9a467f17625e3de9965e897b4056fd45a47f...
INFO[0003] Using caching version of cmd: RUN groupadd -g 1001 testie1
INFO[0003] Skipping unpacking as no commands require it.
INFO[0003] RUN groupadd -g 1001 testie1
INFO[0003] Found cached layer, extracting to filesystem
INFO[0004] Storing source image from stage 0 at path /kaniko/stages/0
INFO[0008] Deleting filesystem...
INFO[0008] Base image from previous stage 0 found, using saved tar at path /kaniko/stages/0
INFO[0008] Executing 0 build triggers
INFO[0008] Building stage 'base' [idx: '1', base-idx: '0']
INFO[0008] Checking for cached layer my.repo/kaniko-testing-cache:836b6a69360e27ebbcb45ba48cb7ebde66dce9f0b5e6ece466526f790a263d53...
INFO[0009] No cached layer found for cmd RUN groupadd -g 1002 testie2
INFO[0009] Unpacking rootfs as cmd RUN groupadd -g 1002 testie2 requires it.
INFO[0009] RUN groupadd -g 1002 testie2
INFO[0009] Initializing snapshotter ...
INFO[0009] Taking snapshot of full filesystem...
INFO[0011] Cmd: /bin/sh
INFO[0011] Args: [-c groupadd -g 1002 testie2]
INFO[0011] Running: [/bin/sh -c groupadd -g 1002 testie2]
INFO[0011] Taking snapshot of full filesystem...
INFO[0011] Pushing layer my.repo/kaniko-testing-cache:836b6a69360e27ebbcb45ba48cb7ebde66dce9f0b5e6ece466526f790a263d53 to cache now
INFO[0011] Pushing image to my.repo/kaniko-testing-cache:836b6a69360e27ebbcb45ba48cb7ebde66dce9f0b5e6ece466526f790a263d53
INFO[0012] Pushed my.repo/kaniko-testing-cache@sha256:d59ec9143f95ad41f88680ff94a5bdad2e8f13bd1b813090cd12430f97830d88
INFO[0012] Skipping push to container registry due to --no-push flag
```
**Triage Notes for the Maintainers**

| Description | Yes/No |
|---|---|
| Please check if this a new feature you are proposing | |
| Please check if the build works in docker but not in kaniko | |
| Please check if this error is seen when you use `--cache` flag | |
| Please check if your dockerfile is a multistage dockerfile | |
I also suspect this is my issue when using `LABEL` in Dockerfiles: it seems to miss the cache every time, as the labels change on every build in my case.

In our case it's even worse: we removed all the labels and still run into the same issue. The reason is the `Created` timestamp.

Every time the base image gets built, a new image with identical layers is created, because the image config contains an updated timestamp. That alone is not tragic, but the kaniko cache is sensitive to the image digest, not to the digests of the layers. So when this new image is used as a base, the entire cache is lost, even though nothing in the base image changed.
Note that we pass the base image as a build arg:

```Dockerfile
FROM upstream/image:version AS build

ARG BUILD_IMAGE=build
FROM $BUILD_IMAGE AS production
```

with `--build-arg BUILD_IMAGE=local/build:version`, but the arg does not contain the shasum, so nothing should change in the environment.
Also note that I'm aware of the `--reproducible` flag, and it indeed solves the issue, as we then get the exact same sha every time. However, that is unacceptable for us, because our registry cleanup policy relies on the created time being set correctly. So in my eyes this is not an issue with the image we produce, but rather with what goes into the cache key on kaniko's side.

I guess the issue is here: https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/executor/build.go#L301

For my case at least, this could be enough (at https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/executor/build.go#L116):
```diff
-	digest, err := sourceImage.Digest()
+	sourceImageNoTimestamps, err := mutate.CreatedAt(sourceImage, v1.Time{})
+	if err != nil {
+		return nil, err
+	}
+	digest, err := sourceImageNoTimestamps.Digest()
```
Of course this could be extended to also get rid of labels etc.

That seems to fix the timestamp issue on my side. I think I can go ahead and open a PR, and maybe also fix the labels issue, although I think those should be two separate, dependent PRs, as the maintainers might want independent judgement on whether to include labels or not.

This fixes the issues with the labels causing cache misses in my setup. @euven, could you please verify?
```diff
-	digest, err := sourceImage.Digest()
+	cf, err := sourceImage.ConfigFile()
+	if err != nil {
+		return nil, err
+	}
+	cfg := cf.DeepCopy()
+	cfg.Created = v1.Time{}
+	cfg.Config.Labels = map[string]string{}
+	sourceImageReproducible, err := mutate.ConfigFile(sourceImage, cfg)
+	if err != nil {
+		return nil, err
+	}
+
+	digest, err := sourceImageReproducible.Digest()
```