
Misleading error when docker is not installed

Open chadleywilson opened this issue 2 years ago • 5 comments

Output of helm version: version.BuildInfo{Version:"v3.10.1", GitCommit:"9f88ccb6aee40b9a0535fcc7efea6055e1ef72c9", GitTreeState:"clean", GoVersion:"go1.18.7"}

Output of kubectl version: WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version. Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.17", GitCommit:"22a9682c8fe855c321be75c5faacde343f909b04", GitTreeState:"clean", BuildDate:"2023-08-23T23:44:35Z", GoVersion:"go1.20.7", Compiler:"gc", Platform:"linux/amd64"} Kustomize Version: v4.5.4 Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.17", GitCommit:"22a9682c8fe855c321be75c5faacde343f909b04", GitTreeState:"clean", BuildDate:"2023-08-23T23:37:25Z", GoVersion:"go1.20.7", Compiler:"gc", Platform:"linux/amd64"}

Cloud Provider/Platform (AKS, GKE, Minikube etc.): onprem baremetal cluster

While trying to connect to an Azure-hosted OCI registry, we kept seeing this error (private info redacted):

k8cp-01:~$ helm upgrade product oci://ourreg.azurecr.io/helm/product-prod --version 3.14.22-release.19 --install --namespace myproduct --create-namespace --values ./values-oci.yaml --debug
history.go:56: [debug] getting history for release product 
Release "product" does not exist. Installing it now.
install.go:192: [debug] Original chart version: "2.12.22-release.13"
DEBU[0000] resolving                                     host=ourreg.azurecr.io
DEBU[0000] do request                                    host=ourreg.azurecr.io request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=Helm/3.10.1 request.method=HEAD url="https://ourreg.azurecr.io/v2/helm/product-prod/manifests/2.12.22-release.13"
DEBU[0000] fetch response received                       host=ourreg.azurecr.io response.header.access-control-expose-headers=Docker-Content-Digest response.header.access-control-expose-headers.1=WWW-Authenticate response.header.access-control-expose-headers.2=Link response.header.access-control-expose-headers.3=X-Ms-Correlation-Request-Id response.header.connection=keep-alive response.header.content-length=212 response.header.content-type="application/json; charset=utf-8" response.header.date="Mon, 13 Nov 2023 09:56:58 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.server=openresty response.header.strict-transport-security="max-age=31536000; includeSubDomains" response.header.strict-transport-security.1="max-age=31536000; includeSubDomains" response.header.www-authenticate="Bearer realm=\"https://composerimage01.azurecr.io/oauth2/token\",service=\"composerimage01.azurecr.io\",scope=\"repository:helm/composer-prod:pull\"" response.header.x-content-type-options=nosniff response.header.x-ms-correlation-request-id=f1116720-f13f-4ba2-921e-759dc0faf57b response.status="401 Unauthorized" url="https://composerimage01.azurecr.io/v2/helm/composer-prod/manifests/2.12.22-release.13"
DEBU[0000] Unauthorized                                  header="Bearer realm=\"https://ourreg.azurecr.io/oauth2/token\",service=\"ourreg.azurecr.io\",scope=\"repository:helm/product-prod:pull\"" host=ourreg.azurecr.io
DEBU[0000] do request                                    host=ourreg.azurecr.io request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=Helm/3.10.1 request.method=HEAD url="https://ourreg.azurecr.io/v2/helm/product-prod/manifests/2.12.22-release.13"
INFO[0000] trying next host                              error="failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized" host=ourreg.azurecr.io
Error: failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized
helm.go:84: [debug] failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized

The back story is simple. We set up the OCI repo and some charts in Azure from one of our cluster servers, and a few tech laptops were involved as well. All the work was initially done in our CI environment. Due to the complexity of our 30-odd virtual subnets talking to our 20-odd real subnets, I don't want to share the exact nature of our networks, if you don't mind. But we had a lot of routing and firewall policy issues before we got the laptops and servers to all talk to the OCI registry properly. And it was great when it worked.

Once we started to roll out the use of the OCI registry on different clusters and different control planes, we hit this error over and over. Because we have so many clusters, some would work and some would spit this error out. I wanted to inform you that it has taken us months to figure out that the "401 Unauthorized" error is not useful, and it's really not an authorization issue at all.

The fix was found by total accident: we needed to do something for another namespace on one of the clusters that had the OCI unauthorized error, and suddenly OCI started working. We had left the broken deployment in the ErrImagePull or ImagePullBackOff state, and all of a sudden the pods started running.

The issue is that docker had not been installed. We have confirmed this on all our clusters now.

We did not think we needed docker because we are using containerd.

I think logically the error should rather say that docker is not installed, or that an executable is missing. Something a little more useful.

In the 3 months we have been banging our heads against the wall, none of us on the team has stumbled on any document stating that docker is a dependency of helm.

And so we assumed from the error that we had a firewall, subnet access, access token, or some kind of account issue.

chadleywilson avatar Nov 14 '23 09:11 chadleywilson

I don't think that docker is required. I suspect it might just be a config file that's needed. I can't test that, however, without a reproducible issue.

Can you set up something that I can confirm that's not your proprietary property?

joejulian avatar Nov 14 '23 17:11 joejulian
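
One way to test the config-file theory without installing anything is to check which credential files on the affected host mention the registry at all. The sketch below assumes the default Linux locations for Helm's registry config and Docker's client config, with HELM_REGISTRY_CONFIG and DOCKER_CONFIG as overrides. The function names are made up for illustration. As far as I understand it, Helm can fall back to Docker's config file when its own has no entry for a host, so a missing entry in both files may be what leads to the "failed to fetch anonymous token" message in the log above. Treat that as an assumption to verify, not a confirmed diagnosis.

```shell
#!/bin/sh
# Sketch: report which credential files mention a given registry host.
# Paths are assumed defaults on Linux; adjust if your setup differs.

# Succeeds if the file exists and contains the host as a quoted JSON key.
# $1 = credentials file, $2 = registry host
has_registry_entry() {
  [ -f "$1" ] && grep -qF "\"$2\"" "$1"
}

# Prints one line per candidate file for the given registry host.
# $1 = registry host
check_registry_creds() {
  host="$1"
  helm_cfg="${HELM_REGISTRY_CONFIG:-$HOME/.config/helm/registry/config.json}"
  docker_cfg="${DOCKER_CONFIG:-$HOME/.docker}/config.json"
  for cfg in "$helm_cfg" "$docker_cfg"; do
    if has_registry_entry "$cfg" "$host"; then
      echo "entry found: $cfg"
    else
      echo "no entry: $cfg"
    fi
  done
}
```

Running check_registry_creds ourreg.azurecr.io on a working host and on a failing host, as the same user that runs helm, should show whether the difference is a credentials file rather than the docker binary itself. A plain text match like this will not see credentials held in an external credential helper.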

I think I may have worked this out. Docker seems to be required on the Azure OCI repositories side, so when using helm to push and pull, the host system seems to need it installed. I have no idea why.

chadleywilson avatar Nov 15 '23 09:11 chadleywilson

Did you authenticate with ACR / the registry? docs: https://learn.microsoft.com/en-us/azure/container-registry/container-registry-helm-repos#authenticate-with-the-registry

Installing docker, and then authenticating with ACR for pulling images, might also allow Helm to successfully authn to the registry as well.

gjenkins8 avatar Nov 16 '23 17:11 gjenkins8
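
For reference, logging Helm itself in to the registry (without going through docker) could look roughly like the sketch below. It uses helm registry login with --username and --password-stdin. The two function names are hypothetical helpers, and the chart reference is the redacted one from the issue.

```shell
#!/bin/sh
# Sketch: log Helm in to the registry that hosts an OCI chart.

# Strip the oci:// scheme and the repository path, leaving only the host.
# $1 = chart reference, e.g. oci://ourreg.azurecr.io/helm/product-prod
registry_host() {
  ref="${1#oci://}"
  printf '%s\n' "${ref%%/*}"
}

# Log in with a password or token read from stdin, so the secret
# does not end up in shell history or the process list.
# $1 = chart reference, $2 = username
helm_login_for_chart() {
  helm registry login "$(registry_host "$1")" --username "$2" --password-stdin
}
```

Usage would be along the lines of: printf '%s' "$ACR_TOKEN" | helm_login_for_chart oci://ourreg.azurecr.io/helm/product-prod "$ACR_USER", where ACR_TOKEN and ACR_USER are placeholder variables. The login should be run as the same user that later runs helm upgrade, since the credentials are stored in that user's config.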

> Did you authenticate with ACR / the registry? docs: https://learn.microsoft.com/en-us/azure/container-registry/container-registry-helm-repos#authenticate-with-the-registry
>
> Installing docker, and then authenticating with ACR for pulling images, might also allow Helm to successfully authn to the registry as well.

Yes. I get "Login Succeeded" for both the token and anonymous methods, but still get "unexpected status: 401 Unauthorized" when we run helm upgrade. I tested this yesterday on possibly the last cluster in the establishment, and installing docker solved it again.

The odd thing is that we created a copy of the supplier's repositories, which were docker repositories, using this script. I am just showing the bit that does the work:

for source_repository in $source_repositories
do
    echo "Copying images from $source_repository"
    # NOTE: $tag is set by code omitted from this excerpt
    echo "found tag $tag"
    # Pull the image from GBST repository
    docker pull "$source_registry/$source_repository:$tag"
    # Tag the image for ACR repository
    docker tag "$source_registry/$source_repository:$tag" "$target_registry/$source_repository:$tag"
    # Push the image to ACR repository
    docker push "$target_registry/$source_repository:$tag"
    # Remove local copies of the images
    docker image rm "$source_registry/$source_repository:$tag"
    docker image rm "$target_registry/$source_repository:$tag"
done

For the chart, we used Helm to pull it and push it into a repo in Azure.

So basically it's a bunch of docker images in our repo.

If you say helm doesn't have a dependency on docker, I am happy to say it must be something to do with our supplier's charts.

chadleywilson avatar Nov 17 '23 08:11 chadleywilson

This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.

github-actions[bot] avatar Feb 16 '24 00:02 github-actions[bot]