
Build Multi Arch Image Error: server message: insufficient_scope: authorization failed

Open wangcanfengxs opened this issue 1 year ago • 11 comments

Contributing guidelines

I've found a bug and checked that ...

  • [x] ... the documentation does not mention anything about my problem
  • [x] ... there are no open or closed issues that are related to my problem

Description

docker buildx build --pull -f ./Dockerfile --build-arg BASE_IMAGE=harbor.example.com/devops/jdk8-openjdk-skiff:v1.6 --platform linux/amd64,linux/arm64 --push . -t harbor.example.com/lctest/test:4dcd8eac

Expected behaviour

push successfully

Actual behaviour

#23 [auth] lcdev/playwright-java8:pull lctest/test:pull,push token for harbor.example.com
#23 sha256:d0078f5ccd4e4b94fbcb708aceebf38d48339d9d0fb6e5a308bc05fc0860e58c
#23 DONE 0.0s

#24 [auth] lcdev/playwright-java8:pull lctest/test:pull,push token for harbor.example.com
#24 sha256:d6813da4a1ba3d97cfaac03368aca522c1145b52ed917f1a1034fab9dea8f8f0
#24 DONE 0.0s

#21 exporting to image
#21 sha256:d1f58143e758915860091f89ba9a815b655303b4a9da7e6ca9b9aa917ab657f9
#21 pushing layers 0.4s done
#21 ERROR: failed to push harbor.example.com/lctest/test: server message: insufficient_scope: authorization failed
------
 > exporting to image:
------
error: failed to solve: rpc error: code = Unknown desc = failed to push harbor.cloud.netease.com/qztest/cicd-testng:4dcd8eac: server message: insufficient_scope: authorization failed

Buildx version

github.com/docker/buildx v0.11.2 9872040b6626fb7d87ef7296fd5b832e8cc2ad17

Docker info

Client:
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.11.2)

Server:
 Containers: 37
  Running: 16
  Paused: 0
  Stopped: 21
 Images: 141
 Server Version: 19.03.13
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8fba4e9a7d01810a393d5d25a3621dc101981175
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.268-1.el7.elrepo.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.62GiB
 Name: node20240130-015
 ID: YPCI:5MHR:M67F:PBWU:YTF5:GXOA:KTMR:VQW6:4CRC:TDAF:5Q37:Q5YH
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 110
  Goroutines: 105
  System Time: 2024-03-28T10:55:28.268585796+08:00
  EventsListeners: 0
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: true
 Insecure Registries:
  harbor.example.com
  127.0.0.0/8
 Live Restore Enabled: true

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Builders list

NAME/NODE                      DRIVER/ENDPOINT             STATUS  BUILDKIT       PLATFORMS
cicd_multi_platform_builder *  docker-container
  cicd_multi_platform_builder0 unix:///var/run/docker.sock running v0.12.5        linux/amd64*, linux/amd64/v2*, linux/amd64/v3*, linux/arm64*, linux/riscv64*, linux/ppc64le*, linux/s390x*, linux/386*, linux/mips64le*, linux/mips64*, linux/arm/v7*, linux/arm/v6*
default                        docker
  default                      default                     running v0.6.4+df89d4d linux/amd64, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v6

Configuration

none

Build logs

No response

Additional info

No response

wangcanfengxs avatar Mar 28 '24 02:03 wangcanfengxs

I found that when pushing the image, buildx authenticates against another repo:

#24 [auth] lcdev/playwright-java8:pull lctest/cicd-testng:pull,push token for harbor.example.com

I indeed have no permission for the lcdev repo, but why would buildx check pull permission for lcdev/playwright-java8?

wangcanfengxs avatar Mar 28 '24 03:03 wangcanfengxs

got the same issue, which I can reproduce 100% of the time:

  1. create a registry A, build and push an image to registry A -> OK
  2. remove registry A
  3. create a registry B, build and push the same image as before, but tagged for registry B -> fails (buildx asks for a token to pull from registry A, which fails because registry A is no longer allowed)

nox-404 avatar Apr 08 '24 13:04 nox-404

The only workaround I could find was to purge the docker buildx cache.

nox-404 avatar Apr 08 '24 13:04 nox-404

When using a docker distribution with an S3 backend, I see this request, which responds with 401 and asks for a bearer token covering both repositories: https://[my-s3-api]/[repository-b/image-b]/blobs/uploads/?mount=sha256:[somesha256]&from=[repository-a/image-a]
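The request above can be sketched as follows. All registry and repository names are placeholders, and the challenge scope shown in the comments is the shape a distribution-spec token server would typically ask for; this is an illustration, not a reproduction of the reporter's exact traffic:

```shell
# Sketch of the cross-repo blob mount request buildkit issues when it
# believes the target registry already holds a layer under another
# repository. Every name below is a placeholder.
REGISTRY="my-registry.example.com"
TARGET_REPO="repository-b/image-b"   # repository being pushed to
SOURCE_REPO="repository-a/image-a"   # repository assumed to already hold the layer
DIGEST="sha256:0123456789abcdef"     # placeholder layer digest

MOUNT_URL="https://${REGISTRY}/v2/${TARGET_REPO}/blobs/uploads/?mount=${DIGEST}&from=${SOURCE_REPO}"
echo "$MOUNT_URL"

# An actual request would look like:
#   curl -i -X POST "$MOUNT_URL"
# and the 401 WWW-Authenticate challenge asks for a token scoped to BOTH
# repositories, e.g.:
#   repository:repository-a/image-a:pull repository:repository-b/image-b:pull,push
```

If the client's token only covers the target repository, this mount attempt is what surfaces as `insufficient_scope`.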

nox-404 avatar Apr 08 '24 13:04 nox-404

I'm facing the same issue.

The only workaround I could find was to purge the docker buildx cache.

It works, but I can't use that workaround in my situation.

https://[my-s3-api]/[repository-b/image-b]/blobs/uploads/?mount=sha256:[somesha256]&from=[repository-a/image-a]

It looks like buildx requests and uses only one token, even though two different repositories with different permissions are involved.

Sryther avatar Apr 16 '24 14:04 Sryther

I found that when pushing the image, buildx authenticates against another repo:

#24 [auth] lcdev/playwright-java8:pull lctest/cicd-testng:pull,push token for harbor.example.com

I indeed have no permission for the lcdev repo, but why would buildx check pull permission for lcdev/playwright-java8?

@wangcanfengxs, I think buildx tries to pull cached layers from lcdev repo before pushing to lctest because it probably already pushed some of them before.

Sryther avatar Apr 16 '24 14:04 Sryther

I have the same issue.

#15 exporting to image
#15 exporting layers
#15 exporting layers 0.9s done
#15 ...
#16 [auth] <my-project>:pull,push token for <my-registry>
#16 DONE 0.0s
#15 exporting to image
#15 exporting manifest sha256:8bc3b8395b9cdff49df4f1bc38634db4dbe156d0f3f53f9b619913c18e24ce37 0.0s done
#15 exporting config sha256:d6d710ce8bb4d88dc72d72083e3141d16657a2c7fb74c12347ffe2f214d73461 0.0s done
#15 pushing layers
#15 ...
#17 [auth] <other-project>:pull <my-project>:pull,push token for <my-registry>
#17 DONE 0.0s
#18 [auth] <other-project>:pull <my-project>:pull,push token for <my-registry>
#18 DONE 0.0s
#15 exporting to image
#15 pushing layers 1.4s done
#15 ERROR: failed to push <my-project>:<my-tag>: server message: insufficient_scope: authorization failed

buildx has the correct auth token at #16, but somewhere it caches a completely different token from another project that doesn't even use the same base image (a Flutter vs. a Python project).

#17 [auth] <other-project>:pull <my-project>:pull,push token for

Here it has both tokens, but it looks like only the first one is tried; otherwise the push would work.

In my case I create a custom builder that runs buildx on an arm64 runner for arm64 builds:

$ docker buildx create  --name local_remote_builder  --node local_builder  --platform linux/amd64,linux/386  --driver-opt env.BUILDKIT_STEP_LOG_MAX_SIZE=10000000  --driver-opt env.BUILDKIT_STEP_LOG_MAX_SPEED=10000000
$ docker buildx create --name local_remote_builder --append --node arm64_builder --platform linux/arm64,linux/arm/v7,linux/arm/v6 --driver-opt env.BUILDKIT_STEP_LOG_MAX_SIZE=10000000 --driver-opt env.BUILDKIT_STEP_LOG_MAX_SPEED=10000000 ssh://user@host
$ docker buildx use local_remote_builder
$ docker buildx build --progress plain --platform ${PLATFORMS} ${EXTRA_ARGS} ${DESTINATIONS} --push --provenance=false .

trivialkettle avatar Nov 28 '24 08:11 trivialkettle

I also see this using registry 2.8.3 with Keycloak as the authenticator. If you build an image for repoA, and afterwards build another image for repoB that shares a layer with repoA, buildx will fetch 2 auth tokens:

  1. repoB:pull,push
  2. repoA:pull repoB:pull,push

The problem is that the auth token buildx_buildkit uses is always the first one, so when it tries to fetch (pull) the repoA layer, it fails with the authorization scope error: it uses the first auth token instead of the second one.

This issue is not multi-arch specific; it affects any registry that uses scoped auth tokens.

In my example the logs show the authentication is OK, but then it fails with the insufficient scope message:

(...)
#8 [auth] test-4:pull,push token for 172.31.1.30:5000
#8 DONE 0.0s

#9 [auth] test-2:pull test-4:pull,push token for 172.31.1.30:5000
#9 DONE 0.0s

(...)

#7 pushing layers
#7 pushing layers 0.1s done
#7 ERROR: failed to push 172.31.1.30:5000/test-4:2.2: server message: insufficient_scope: authorization failed

I have a packet capture where I can see that both auth tokens are received by the buildx process, but the buildkit container then makes the POST request only with the first auth token, never the second, in all its retries.

The issue seems to be that either buildx shouldn't try to pull the "shared" layer from repoA (and should use the local cache instead), or the push needs to use the second auth token it received instead of only trying the first one.
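The token-selection problem described above can be illustrated with a minimal sketch. The scope strings follow the distribution token format `repository:<name>:<actions>`; the repository names (test-2, test-4) are taken from the logs above, but the matching function itself is hypothetical, showing the check a client would need before reusing a cached token:

```shell
# Two tokens as in the logs: the first only covers the target repo,
# the second also covers the repo the shared layer lives in.
token1_scope="repository:test-4:pull,push"
token2_scope="repository:test-2:pull repository:test-4:pull,push"

# covers <scope-string> <repo> <action>
# Returns success if any scope entry grants <action> on <repo>.
covers() {
  local scope=$1 repo=$2 action=$3 part
  for part in $scope; do
    case "$part" in
      "repository:${repo}:"*"${action}"*) return 0 ;;
    esac
  done
  return 1
}

covers "$token1_scope" "test-2" "pull" && echo "token1 covers test-2:pull" || echo "token1 lacks test-2:pull"
covers "$token2_scope" "test-2" "pull" && echo "token2 covers test-2:pull" || echo "token2 lacks test-2:pull"
```

Picking a token by this kind of scope match, rather than always reusing the first cached one, is what would let the cross-repo layer fetch succeed.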

Doing a docker buildx prune solves the issue, as it won't try to use the shared layer.

I can provide a docker compose file to bootstrap a registry with keycloak integration to replicate the issue.

zauwn avatar Jan 27 '25 11:01 zauwn