buildkit on kubernetes fails to trust CA certificate even when it is properly added to container no matter what is added to config.toml

Open puckettgw opened this issue 1 year ago • 6 comments

I'm running into this issue even with the config available and explicitly defined: I set up the builder using the CLI:

docker --tlscacert=/etc/ssl/certs/ca-certificates.crt buildx create --bootstrap --name=kube --driver=kubernetes --driver-opt=namespace=something --driver-opt=image=registry.somewhere/buildkit:2.0 --buildkitd-flags="--config=/etc/buildkit/buildkitd.toml"

The /etc/buildkit/buildkitd.toml looks like this:

debug = true
trace = true

insecure-entitlements = ["network.host", "security.insecure"]

[registry."registry.somewhere"]
http = true
insecure = true
ca=["/usr/local/share/ca-certificates/ca.crt"]

[registry."registry.somewhere:443"]
http = true
insecure = true
ca=["/usr/local/share/ca-certificates/ca.crt"]

The CA cert and config toml are baked into the image at the expected locations. I tried updating the local trust store via update-ca-certificates and validated that the cert was properly appended to the trust store, but it still didn't work. So that is why I directly pointed to the CA location like this.
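For anyone repeating the trust-store check described above, here is a minimal sketch of one way to confirm that a CA certificate really is present in a bundle. It compares the base64 bodies after stripping line breaks, so differences in line wrapping do not matter. The function names `pem_bodies` and `cert_in_bundle` are made up for this example and are not part of buildkit or any other tool.

```shell
#!/bin/sh
# Print one line per certificate in a PEM file: the base64 body with
# line breaks and stray whitespace removed.
pem_bodies() {
  awk '
    /-----BEGIN CERTIFICATE-----/ { inside = 1; body = ""; next }
    /-----END CERTIFICATE-----/   { if (inside) print body; inside = 0; next }
    inside { gsub(/[ \t\r]/, ""); body = body $0 }
  ' "$1"
}

# Return 0 if the first certificate in CA_FILE also appears in BUNDLE_FILE,
# 1 if it does not, 2 if CA_FILE is missing or contains no certificate.
cert_in_bundle() {
  ca_body=$(pem_bodies "$1" | head -n 1)
  [ -n "$ca_body" ] || return 2
  pem_bodies "$2" | grep -Fxq -- "$ca_body"
}
```

With the paths from this issue, the check would be `cert_in_bundle /usr/local/share/ca-certificates/ca.crt /etc/ssl/certs/ca-certificates.crt && echo present`. It only shows that the bundle contains the certificate, not which process actually reads that bundle.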

The buildkit container shows the process running with the config file:

kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
PID   USER     TIME  COMMAND
    1 root      0:19 buildkitd --config=/etc/buildkit/buildkitd.toml --allow-insecure-entitlement=network.host

However, when I attempt to build and push an image via this command: docker --tlscacert=/etc/ssl/certs/ca-certificates.crt buildx build . --builder=kube -t registry.somewhere/someimage:sometag --push

I receive the following error message:

ERROR: failed to solve: failed to push registry.somewhere/someimage:4be66a33c47f0af8a6fa02e5913a734d8a6028eb: failed to authorize: failed to fetch anonymous token: Get "https://registry.somewhere/v2/token?scope=%2A%3A%3A&scope=repository%3A***%2Fsomeimage%3Apull%2Cpush&service=container_registry": tls: failed to verify certificate: x509: certificate signed by unknown authority

When I look in the logs for the pod, I notice the following:

194 0.19.1-1 /usr/libexec/docker/cli-plugins/docker-buildx --tlscacert=/etc/ssl/certs/ca-certificates.crt buildx build . --builder=kube -t registry.somewhere/george/someimage:4be66a33c47f0af8a6fa02e5913a734d8a6028eb --push
github.com/moby/buildkit/session/auth/authprovider.(*authProvider).FetchToken
	/build/src/vendor/github.com/moby/buildkit/session/auth/authprovider/authprovider.go:143
github.com/moby/buildkit/session/auth._Auth_FetchToken_Handler.func1
	/build/src/vendor/github.com/moby/buildkit/session/auth/auth_grpc.pb.go:166
github.com/moby/buildkit/session/auth._Auth_FetchToken_Handler
	/build/src/vendor/github.com/moby/buildkit/session/auth/auth_grpc.pb.go:168
google.golang.org/grpc.(*Server).processUnaryRPC
	/build/src/vendor/google.golang.org/grpc/server.go:1394
google.golang.org/grpc.(*Server).handleStream
	/build/src/vendor/google.golang.org/grpc/server.go:1805
google.golang.org/grpc.(*Server).serveStreams.func2.1
	/build/src/vendor/google.golang.org/grpc/server.go:1029
runtime.goexit
	/usr/local/go/src/runtime/asm_arm64.s:1222

1 v0.17.3 buildkitd --config=/etc/buildkit/buildkitd.toml --allow-insecure-entitlement=network.host
main.unaryInterceptor
	/src/cmd/buildkitd/main.go:717
google.golang.org/grpc.NewServer.chainUnaryServerInterceptors.chainUnaryInterceptors.func1
	/src/vendor/google.golang.org/grpc/server.go:1202
github.com/moby/buildkit/api/services/control._Control_Solve_Handler
	/src/api/services/control/control_grpc.pb.go:289
google.golang.org/grpc.(*Server).processUnaryRPC
	/src/vendor/google.golang.org/grpc/server.go:1394
google.golang.org/grpc.(*Server).handleStream
	/src/vendor/google.golang.org/grpc/server.go:1805
google.golang.org/grpc.(*Server).serveStreams.func2.1
	/src/vendor/google.golang.org/grpc/server.go:1029
runtime.goexit
	/usr/local/go/src/runtime/asm_arm64.s:1222

I am using the buildkit 0.17.3 image. I noticed that the first block in the trace shows buildx version 0.19.1-1, which I have not specified anywhere. It also doesn't seem to exist within the container, so it's being pulled from somewhere and executed. Why? Where is it coming from, and is it being executed with different arguments that ignore my configuration or CA certs? This behavior also occurs in newer images; I only tried an older one to see whether the problem was introduced in the latest image.

Whether it's the cause or not, the issue remains: a CA certificate that is trusted on the host and in the buildkit container, and explicitly defined as the CA cert to use for the registry, is not being trusted by buildkit when attempting to push images.

puckettgw avatar Dec 06 '24 20:12 puckettgw
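In the trace above, the `FetchToken` frames appear under the `docker-buildx` process rather than under `buildkitd`, which suggests the token request that failed verification may have been made on the client side. One way to narrow this down is to run the same TLS probe from both the client machine and the buildkitd pod. The sketch below assumes `curl` is installed wherever it runs; `probe_tls` is a made-up helper name, and the `/v2/` URL in the usage note is an assumed registry endpoint rather than something taken from the logs.

```shell
#!/bin/sh
# Probe a TLS endpoint, verifying the server against the given CA file.
# Returns 0 if the request completed, 60 if curl could not verify the
# certificate, 200 if the CA file is unreadable (a code chosen for this
# sketch to avoid clashing with curl's own), or another curl exit code.
probe_tls() {
  url=$1
  ca_file=$2
  if [ ! -r "$ca_file" ]; then
    echo "CA file not readable: $ca_file" >&2
    return 200
  fi
  curl --silent --show-error --output /dev/null --max-time 10 \
       --cacert "$ca_file" "$url"
}
```

For example, `probe_tls https://registry.somewhere/v2/ /usr/local/share/ca-certificates/ca.crt; echo $?` could be run once on the host that invokes `docker buildx` and once inside the pod. An exit code of 60 on only one side would point at where the CA is missing.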

@puckettgw did you solve the issue? I'm struggling with the exact same issue. It works on a VM but not in a k8s pod.

minasanastasi avatar Aug 02 '25 07:08 minasanastasi

I was experiencing this problem. My setup was buildkitd running as a Deployment and buildctl connecting to it remotely using TLS. What fixed it for me was trusting the CA in the container running buildctl. Not sure if it's even necessary to have that trust established in the daemon container, but I left it in anyways. Hope this helps.

achernev avatar Aug 02 '25 07:08 achernev

Will try that, but supposedly buildctl hands the pull or push off to the buildkit daemon container, so I expected TLS to happen at the daemon. Thanks man.

minasanastasi avatar Aug 02 '25 08:08 minasanastasi

That’s what I thought, too, and it took me a long time to get desperate enough to try that.

achernev avatar Aug 02 '25 08:08 achernev

Indeed, I can confirm that it works. I added my CA to the container running buildctl and it works like a charm. I tried it again, but this time removed the CA from the daemon, and it failed even earlier, on the pull of the base image. So both actually need the CA. Honestly, I spent hours on the issue and it was so simple.

minasanastasi avatar Aug 02 '25 12:08 minasanastasi

> That’s what I thought, too, and it took me a long time to get desperate enough to try that.

It took me two days, sigh.

> I was experiencing this problem. My setup was buildkitd running as a Deployment and buildctl connecting to it remotely using TLS. What fixed it for me was trusting the CA in the container running buildctl. Not sure if it’s even necessary to have that trust established in the daemon container, but I left it in anyways. Hope this helps.

I confirmed this requirement/behavior today. That's a crazy troubleshooting experience.

Self-signed certificates must be trusted on both sides: by buildkitd and by the buildctl client.

You can install the cert on each side as usual.

guhuajun avatar Sep 18 '25 01:09 guhuajun
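As a rough sketch of the "install the cert on both sides" conclusion above, the helper below copies a CA file into the trust-anchor directory and refreshes the bundle. It assumes a Debian- or Alpine-style image that ships `update-ca-certificates`. `install_ca`, `CA_ANCHOR_DIR`, and `CA_REFRESH_CMD` are names invented for this example; the default directory is the one used in the original post.

```shell
#!/bin/sh
# Copy a CA certificate into a trust-anchor directory and refresh the bundle.
# CA_ANCHOR_DIR and CA_REFRESH_CMD are hypothetical overrides for this sketch.
# On Debian-style systems the file name is expected to end in .crt.
install_ca() {
  src=$1
  anchor_dir=${CA_ANCHOR_DIR:-/usr/local/share/ca-certificates}
  refresh_cmd=${CA_REFRESH_CMD:-update-ca-certificates}
  [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
  mkdir -p "$anchor_dir" || return 1
  cp "$src" "$anchor_dir/$(basename "$src")" || return 1
  # Rebuild the system bundle so the new anchor is picked up.
  $refresh_cmd
}
```

Based on the reports in this thread, the same step would be run in two places: in the image or container that runs buildctl (or `docker buildx`), and in the buildkitd image. It typically needs root, for example as a `RUN` step while building each image.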