
`--label` causes cache invalidation in later stages

Open euven opened this issue 1 year ago • 6 comments

**Actual behavior** A change in a `--label` value causes cache invalidation for stages after the first one.

**Expected behavior** Caches are used as expected for a RUN command in all stages if there have been no changes to it.

**To Reproduce** Steps to reproduce the behavior:

  1. Build the following Dockerfile with kaniko, setting the flags `--cache --target stage-2 --label somelabel=1`:

```dockerfile
FROM ubuntu:22.04 AS base

RUN groupadd -g 1001 testie1

FROM base AS stage-2

RUN groupadd -g 1002 testie2
```

```shell
docker run -v "/tmp/config.json:/kaniko/.docker/config.json" -v $PWD:/workspace2 gcr.io/kaniko-project/executor:debug --dockerfile /workspace2/docker/ubuntu-base-images/Dockerfile --no-push --context /workspace2/ --cache --cache-repo gitlab.catalyst.net.nz:4567/eugene/docker-images/kaniko-testing-cache --target stage-2 --label somelabel=6
```

  2. Run step 1 again and observe the cache being used for both stages:

```
INFO[0000] Resolved base name ubuntu:22.04 to base
INFO[0000] Resolved base name base to stage-2
INFO[0000] Using dockerignore file: /workspace2/.dockerignore
INFO[0000] Retrieving image manifest ubuntu:22.04
INFO[0000] Retrieving image ubuntu:22.04 from registry index.docker.io
INFO[0002] Retrieving image manifest ubuntu:22.04
INFO[0002] Returning cached image manifest
INFO[0003] Built cross stage deps: map[]
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Executing 0 build triggers
INFO[0003] Building stage 'ubuntu:22.04' [idx: '0', base-idx: '-1']
INFO[0003] Checking for cached layer my.repo/kaniko-testing-cache:199088e7ceaea1f32dadea11e45b9a467f17625e3de9965e897b4056fd45a47f...
INFO[0003] Using caching version of cmd: RUN groupadd -g 1001 testie1
INFO[0003] Skipping unpacking as no commands require it.
INFO[0003] RUN groupadd -g 1001 testie1
INFO[0003] Found cached layer, extracting to filesystem
INFO[0003] Storing source image from stage 0 at path /kaniko/stages/0
INFO[0008] Deleting filesystem...
INFO[0008] Base image from previous stage 0 found, using saved tar at path /kaniko/stages/0
INFO[0008] Executing 0 build triggers
INFO[0008] Building stage 'base' [idx: '1', base-idx: '0']
INFO[0008] Checking for cached layer my.repo/kaniko-testing-cache:6e4d66b4f2793661198d4c386523cb3898018f5d68c1401938cd6834fcc26e1a...
INFO[0008] Using caching version of cmd: RUN groupadd -g 1002 testie2
INFO[0008] Skipping unpacking as no commands require it.
INFO[0008] RUN groupadd -g 1002 testie2
INFO[0008] Found cached layer, extracting to filesystem
INFO[0008] Skipping push to container registry due to --no-push flag
```
  3. Change to `--label somelabel=2`, then run the build again. You'll note the cache is not used for the RUN in the second stage:
```
INFO[0000] Resolved base name ubuntu:22.04 to base
INFO[0000] Resolved base name base to stage-2
INFO[0000] Using dockerignore file: /workspace2/.dockerignore
INFO[0000] Retrieving image manifest ubuntu:22.04
INFO[0000] Retrieving image ubuntu:22.04 from registry index.docker.io
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Built cross stage deps: map[]
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Retrieving image manifest ubuntu:22.04
INFO[0003] Returning cached image manifest
INFO[0003] Executing 0 build triggers
INFO[0003] Building stage 'ubuntu:22.04' [idx: '0', base-idx: '-1']
INFO[0003] Checking for cached layer my.repo/kaniko-testing-cache:199088e7ceaea1f32dadea11e45b9a467f17625e3de9965e897b4056fd45a47f...
INFO[0003] Using caching version of cmd: RUN groupadd -g 1001 testie1
INFO[0003] Skipping unpacking as no commands require it.
INFO[0003] RUN groupadd -g 1001 testie1
INFO[0003] Found cached layer, extracting to filesystem
INFO[0004] Storing source image from stage 0 at path /kaniko/stages/0
INFO[0008] Deleting filesystem...
INFO[0008] Base image from previous stage 0 found, using saved tar at path /kaniko/stages/0
INFO[0008] Executing 0 build triggers
INFO[0008] Building stage 'base' [idx: '1', base-idx: '0']
INFO[0008] Checking for cached layer my.repo/kaniko-testing-cache:836b6a69360e27ebbcb45ba48cb7ebde66dce9f0b5e6ece466526f790a263d53...
INFO[0009] No cached layer found for cmd RUN groupadd -g 1002 testie2
INFO[0009] Unpacking rootfs as cmd RUN groupadd -g 1002 testie2 requires it.
INFO[0009] RUN groupadd -g 1002 testie2
INFO[0009] Initializing snapshotter ...
INFO[0009] Taking snapshot of full filesystem...
INFO[0011] Cmd: /bin/sh
INFO[0011] Args: [-c groupadd -g 1002 testie2]
INFO[0011] Running: [/bin/sh -c groupadd -g 1002 testie2]
INFO[0011] Taking snapshot of full filesystem...
INFO[0011] Pushing layer my.repo/kaniko-testing-cache:836b6a69360e27ebbcb45ba48cb7ebde66dce9f0b5e6ece466526f790a263d53 to cache now
INFO[0011] Pushing image to my.repo/kaniko-testing-cache:836b6a69360e27ebbcb45ba48cb7ebde66dce9f0b5e6ece466526f790a263d53
INFO[0012] Pushed my.repo/kaniko-testing-cache@sha256:d59ec9143f95ad41f88680ff94a5bdad2e8f13bd1b813090cd12430f97830d88
INFO[0012] Skipping push to container registry due to --no-push flag
```

Triage Notes for the Maintainers

| Description | Yes/No |
| --- | --- |
| Please check if this is a new feature you are proposing | No |
| Please check if the build works in docker but not in kaniko | Yes |
| Please check if this error is seen when you use `--cache` flag | Yes |
| Please check if your dockerfile is a multistage dockerfile | Yes |

euven avatar Mar 26 '24 06:03 euven

I also suspect this is my issue when using LABEL in Dockerfiles: it seems to miss the cache every time, as the labels change on every build in my case.

jameswilliams1 avatar May 03 '24 12:05 jameswilliams1

In our case it's even worse: we removed all the labels yet still ran into the same issue. The reason is the `Created` timestamp.


Every time the base image gets built, a new image with identical layers is created, because it contains an updated timestamp. That by itself is not tragic, but the kaniko cache is keyed on the image digest, not on the digests of the layers. So when this new image is used as a base, the entire cache is lost, even though nothing in the base image changed.

Note that we pass the base image as a build-arg:

```dockerfile
FROM upstream/image:version AS build

ARG BUILD_IMAGE=build
FROM $BUILD_IMAGE AS production
```

built with `--build-arg BUILD_IMAGE=local/build:version`,

but the arg does not contain the shasum, so nothing should change in the environment.

Also note that I'm aware of the `--reproducible` flag, and it indeed solves the issue, as we now get the exact same sha every time. However, that is unacceptable to us, as our registry cleanup policy relies on the created time being set correctly. So in my eyes it's not an issue with the image we produce, but rather with what goes into the cache key on kaniko's side.

mzihlmann avatar Oct 10 '24 08:10 mzihlmann
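The digest sensitivity described in the comment above can be illustrated with a small stdlib-only Go sketch (illustrative code, not kaniko's; `imageConfig` and `configDigest` are simplified stand-ins for the OCI image config and its digest computation): two configs with identical layer lists but different `created` timestamps hash to different digests.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// imageConfig is a pared-down stand-in for an OCI image config,
// keeping only the fields relevant to this illustration.
type imageConfig struct {
	Created string   `json:"created"`
	DiffIDs []string `json:"diff_ids"`
}

// configDigest hashes the JSON-serialized config, mimicking how the
// config digest (which feeds the manifest and thus the image digest)
// is derived from the config blob.
func configDigest(c imageConfig) string {
	b, _ := json.Marshal(c)
	return fmt.Sprintf("sha256:%x", sha256.Sum256(b))
}

func main() {
	layers := []string{"sha256:aaa", "sha256:bbb"}
	a := imageConfig{Created: "2024-10-01T00:00:00Z", DiffIDs: layers}
	b := imageConfig{Created: "2024-10-02T00:00:00Z", DiffIDs: layers}

	// Identical layers, yet the digests differ because the config
	// blob (including "created") is what gets hashed.
	fmt.Printf("same layers: %v\n", a.DiffIDs[0] == b.DiffIDs[0])
	fmt.Printf("same digest: %v\n", configDigest(a) == configDigest(b))
}
```

This is why a rebuild of the base image with byte-identical layers still produces a new image digest, and why anything keyed on that digest misses.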

i guess the issue is here https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/executor/build.go#L301

mzihlmann avatar Oct 10 '24 09:10 mzihlmann

For my case at least, this could be enough? https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/executor/build.go#L116

```diff
+ sourceImageNoTimestamps, err := mutate.CreatedAt(sourceImage, v1.Time{})
+ if err != nil {
+ 	return nil, err
+ }
+ digest, err := sourceImageNoTimestamps.Digest()
- digest, err := sourceImage.Digest()
```

Of course, this could be extended to also get rid of labels etc.

mzihlmann avatar Oct 10 '24 09:10 mzihlmann

That seems to fix the timestamp issue on my side. I think I can go ahead and open a PR, and maybe also fix the labels issue, although I think that should be two separate, dependent PRs, as the maintainers might want independent judgement on whether to include labels or not.

mzihlmann avatar Oct 10 '24 11:10 mzihlmann

This fixes the issues with the labels causing cache misses in my setup. @euven, could you please verify?

```diff
+ cf, err := sourceImage.ConfigFile()
+ if err != nil {
+ 	return nil, err
+ }
+ cfg := cf.DeepCopy()
+ cfg.Created = v1.Time{}
+ cfg.Config.Labels = map[string]string{}
+ sourceImageReproducible, err := mutate.ConfigFile(sourceImage, cfg)
+ if err != nil {
+ 	return nil, err
+ }
+
+ digest, err := sourceImageReproducible.Digest()
- digest, err := sourceImage.Digest()
```

mzihlmann avatar Oct 10 '24 12:10 mzihlmann