Misleading error when docker is not installed
Output of helm version: version.BuildInfo{Version:"v3.10.1", GitCommit:"9f88ccb6aee40b9a0535fcc7efea6055e1ef72c9", GitTreeState:"clean", GoVersion:"go1.18.7"}
Output of kubectl version: WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.17", GitCommit:"22a9682c8fe855c321be75c5faacde343f909b04", GitTreeState:"clean", BuildDate:"2023-08-23T23:44:35Z", GoVersion:"go1.20.7", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.17", GitCommit:"22a9682c8fe855c321be75c5faacde343f909b04", GitTreeState:"clean", BuildDate:"2023-08-23T23:37:25Z", GoVersion:"go1.20.7", Compiler:"gc", Platform:"linux/amd64"}
Cloud Provider/Platform (AKS, GKE, Minikube etc.): on-prem bare-metal cluster
While trying to connect to an Azure-hosted OCI registry, we kept seeing this error (private info redacted):
k8cp-01:~$ helm upgrade product oci://ourreg.azurecr.io/helm/product-prod --version 3.14.22-release.19 --install --namespace myproduct --create-namespace --values ./values-oci.yaml --debug
history.go:56: [debug] getting history for release product
Release "product" does not exist. Installing it now.
install.go:192: [debug] Original chart version: "2.12.22-release.13"
DEBU[0000] resolving host=ourreg.azurecr.io
DEBU[0000] do request host=ourreg.azurecr.io request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=Helm/3.10.1 request.method=HEAD url="https://ourreg.azurecr.io/v2/helm/product-prod/manifests/2.12.22-release.13"
DEBU[0000] fetch response received host=ourreg.azurecr.io response.header.access-control-expose-headers=Docker-Content-Digest response.header.access-control-expose-headers.1=WWW-Authenticate response.header.access-control-expose-headers.2=Link response.header.access-control-expose-headers.3=X-Ms-Correlation-Request-Id response.header.connection=keep-alive response.header.content-length=212 response.header.content-type="application/json; charset=utf-8" response.header.date="Mon, 13 Nov 2023 09:56:58 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.server=openresty response.header.strict-transport-security="max-age=31536000; includeSubDomains" response.header.strict-transport-security.1="max-age=31536000; includeSubDomains" response.header.www-authenticate="Bearer realm=\"https://ourreg.azurecr.io/oauth2/token\",service=\"ourreg.azurecr.io\",scope=\"repository:helm/product-prod:pull\"" response.header.x-content-type-options=nosniff response.header.x-ms-correlation-request-id=f1116720-f13f-4ba2-921e-759dc0faf57b response.status="401 Unauthorized" url="https://ourreg.azurecr.io/v2/helm/product-prod/manifests/2.12.22-release.13"
DEBU[0000] Unauthorized header="Bearer realm=\"https://ourreg.azurecr.io/oauth2/token\",service=\"ourreg.azurecr.io\",scope=\"repository:helm/product-prod:pull\"" host=ourreg.azurecr.io
DEBU[0000] do request host=ourreg.azurecr.io request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=Helm/3.10.1 request.method=HEAD url="https://ourreg.azurecr.io/v2/helm/product-prod/manifests/2.12.22-release.13"
INFO[0000] trying next host error="failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized" host=ourreg.azurecr.io
Error: failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized
helm.go:84: [debug] failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized
The back story is simple. We set up the OCI repo and some charts in Azure from one of our cluster servers, and a few tech laptops were involved as well. All the work was initially done in our CI environment. Due to the complexity of our 30-odd virtual subnets talking to our 20-odd real subnets, I'd rather not share the exact nature of our networks. We had a lot of routing and firewall policy issues before we got the laptops and servers all talking to the OCI registry properly, but it worked well once we did.
Once we started to roll out the use of the OCI registry on different clusters and different control planes, we hit this error over and over. We have many clusters; some would work and some would spit this error out. I want to flag that it took us months to figure out that the "401 Unauthorized" error is not useful, and that it is really not an authorization issue at all.
The fix was discovered by total accident: we needed to do something for another namespace on one of the clusters with the OCI unauthorized error, and suddenly the OCI registry started working. We had left the broken deployment in the ErrImagePull or ImagePullBackOff state, and all of a sudden the pods started running.
The issue is that docker had not been installed. We have now confirmed this on all our clusters.
We did not think we needed docker because we are using containerd.
I think the error should instead say that docker is not installed, or "missing executable" -- something a little more useful.
In the 3 months we have been banging our heads against the wall, none of us on the team has stumbled on any document stating that docker is a dependency of Helm.
And so we assumed from the error that we had a firewall issue, a subnet access issue, an access token problem, or some kind of account issue.
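Until the error itself improves, a preflight check on each cluster node would have saved us the guessing. A minimal sketch (our own helper, not anything Helm ships; the function name is ours):

```shell
# Hedged diagnostic sketch: fail fast with an explicit message when a CLI
# the workflow needs is missing from PATH, instead of letting the failure
# surface later as a misleading "401 Unauthorized".
require_cli() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "missing executable: $1" >&2
        return 1
    fi
}

# Example: run on a fresh cluster node before 'helm upgrade'.
require_cli sh && echo "sh found"
require_cli docker || echo "docker not found on this host"
```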
I don't think that docker is required. I suspect it might just be a config file that's needed. I can't test that, however, without a reproducible issue.
Can you set up a reproduction that I can verify, one that doesn't involve anything proprietary on your side?
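For context, the config file I have in mind is Helm's registry credentials file, which `helm registry login` writes (path reported by `helm env HELM_REGISTRY_CONFIG`; on recent Helm 3 releases this is typically `~/.config/helm/registry/config.json`). It is a Docker-style `auths` map; a hedged sketch, where the `auth` value is a placeholder for base64 of `username:password` and the registry name matches the redacted one above:

```json
{
  "auths": {
    "ourreg.azurecr.io": {
      "auth": "<base64 of username:password>"
    }
  }
}
```

Installing docker and running `docker login` creates a similar file at `~/.docker/config.json`, which may be why installing docker appears to fix things on some hosts.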
I think I may have worked this out. Docker seems to be required on the Azure OCI repository side: when using helm to push and pull, the host system seems to need it installed. I have no idea why.
Did you authenticate with ACR / the registry? docs: https://learn.microsoft.com/en-us/azure/container-registry/container-registry-helm-repos#authenticate-with-the-registry
Installing docker, and then authenticating with ACR for pulling images, might also allow Helm to successfully authenticate to the registry as well.
Yes. I get "Login succeeded" for both the token and anonymous methods, but I still get "unexpected status: 401 Unauthorized" when we run helm upgrade. I tested this yesterday on possibly the last cluster in the establishment, and installing docker solved it again.
The odd thing is that we created a copy of the supplier's repositories, which were docker images, using this script. Just showing the bit that does the work:
for source_repository in $source_repositories
do
    echo "Copying images from $source_repository"
    # $tag is set by an enclosing loop over the repository's tags (omitted here)
    echo "found tag $tag"
    # Pull the image from the GBST repository
    docker pull "$source_registry/$source_repository:$tag"
    # Re-tag the image for the ACR repository
    docker tag "$source_registry/$source_repository:$tag" "$target_registry/$source_repository:$tag"
    # Push the image to the ACR repository
    docker push "$target_registry/$source_repository:$tag"
    # Remove the local copies of the image
    docker image rm "$source_registry/$source_repository:$tag"
    docker image rm "$target_registry/$source_repository:$tag"
done
For the chart, we used Helm to pull it and push it into a repo in Azure.
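As a dry-run sketch of that chart-side step (registry names and chart coordinates are placeholders, and the commands are echoed rather than executed here):

```shell
# Placeholders standing in for our real registries and chart coordinates.
source_registry="supplier.example.com"
target_registry="ourreg.azurecr.io"
chart="helm/product-prod"
version="3.14.22-release.19"

# 'helm pull oci://...' saves the chart as <name>-<version>.tgz in the
# working directory; 'helm push' then uploads that archive to ACR.
tgz="${chart##*/}-${version}.tgz"
echo "helm pull oci://$source_registry/$chart --version $version"
echo "helm push $tgz oci://$target_registry/${chart%/*}"
```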
So basically it's a bunch of docker images in our repo.
If you say Helm doesn't have a dependency on docker, I'm happy to say it must be something to do with our supplier's charts.
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.