Fail to pull image from local registry
/kind bug
What steps did you take and what happened: I installed and configured kubeflow-fairing following the procedure on the Kubeflow website, then ran the example from examples/simple/main.py with DOCKER_REGISTRY set to localhost:32000.
A fairing-job image is correctly pushed to the local registry and the job is started, but the pod cannot pull the image from the local registry:
curl -X GET http://localhost:32000/v2/fairing-job/tags/list
{"name":"fairing-job","tags":["640E50A"]}
kubectl describe pod/fairing-job-p8cf2-dcp6g
...
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 30m default-scheduler Successfully assigned default/fairing-job-p8cf2-dcp6g to k8s
Normal Pulling 29m (x4 over 30m) kubelet, k8s Pulling image "localhost:32000/fairing-job:640E50A"
Warning Failed 29m (x4 over 30m) kubelet, k8s Failed to pull image "localhost:32000/fairing-job:640E50A": rpc error: code = Unknown desc = failed to pull and unpack image "localhost:32000/fairing-job:640E50A": failed to copy: httpReaderSeeker: failed open: unexpected status code http://localhost:32000/v2/fairing-job/manifests/sha256:b92a770802696b2303556ff0ff6fb23340d6eb506c8a1080a9f06b12ef28725e: 500 Internal Server Error
Warning Failed 29m (x4 over 30m) kubelet, k8s Error: ErrImagePull
Warning Failed 5m37s (x110 over 30m) kubelet, k8s Error: ImagePullBackOff
Normal BackOff 34s (x132 over 30m) kubelet, k8s Back-off pulling image "localhost:32000/fairing-job:640E50A"
What did you expect to happen: The image should be pulled from the local registry and the job should complete without issue.
Anything else you would like to add:
I'm working with the microk8s registry add-on. I have verified that, using the procedure described at https://microk8s.io/docs/working, I can deploy pods using images pulled from the local registry.
Environment:
- Fairing version: (use python -c "import kubeflow.fairing; print(kubeflow.fairing.__version__)"): 0.6.0
- Kubeflow version: (version number can be found at the bottom left corner of the Kubeflow dashboard): 0.6.2
- Minikube version: microk8s 1.15.3
- Kubernetes version: (use kubectl version): 1.15.3
- OS (e.g. from /etc/os-release): Ubuntu 18.04.3 LTS
@benoitdr personally I think the problem is not related to kubeflow-fairing but to Kubernetes. Could you please try creating a job manually with the image in localhost:32000 and see whether it can be started? Thanks.
@jinchihe, thanks for your suggestion. How can I create the job manually? Is there a way to use kubeflow-fairing to generate a YAML file for it?
I mean just create a sample job to test your local Docker registry :-)
yes that's working. Following the procedure from https://microk8s.io/docs/working, I can deploy an nginx image from the local registry.
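For reference, a manual test job of the kind suggested above could be generated with a few lines of Python. This is only a minimal sketch, not something fairing produces: the function name `make_test_job` and the job name `registry-pull-test` are made up for illustration, and the container simply runs whatever default command the image defines.

```python
import json


def make_test_job(image, name="registry-pull-test"):
    """Build a minimal batch/v1 Job manifest whose only purpose is to pull `image`.

    `name` is a hypothetical job name chosen for this example.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Do not retry: a single pull failure is what we want to observe.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{"name": "main", "image": image}],
                }
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML manifests.
    job = make_test_job("localhost:32000/fairing-job:640E50A")
    print(json.dumps(job, indent=2))
```

Saved as a script (say `make_job.py`, a name assumed here), the output can be piped to `kubectl apply -f -`, and `kubectl describe` on the resulting pod then shows whether the pull succeeds outside of fairing.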
@benoitdr that's strange... I think the fairing job behaves the same as the nginx job here, so this does not seem to be related to kubeflow-fairing.
@jinchihe I'm not sure. I can pull images from a local registry (and from hub.docker.com) but for some reason kubeflow cannot do it. It might be a common issue with https://github.com/kubeflow/fairing/issues/382
@benoitdr, I suspect that your k8s cluster uses the default Docker registry (https://index.docker.io/v1/). If you want your pod to pull the image from your private registry, you need to log in to it first, or change the default registry. Thanks.
It's not a login issue. I think it's related to microk8s. See https://github.com/ubuntu/microk8s/issues/681
I hit the same problem (images can be pulled from Docker Hub but not locally) with Kubernetes in Docker. As a workaround, if you set the pull policy to Never, the pod is forced to use the local images. I'm not sure whether fairing has an option to pass the pull policy value; I will check that later.
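To make the workaround concrete, here is a rough sketch of setting `imagePullPolicy` on a pod spec expressed as a plain Python dict. It does not use any fairing API (whether fairing exposes such an option is exactly the open question above); the helper name `with_pull_policy` is hypothetical.

```python
import copy

VALID_POLICIES = ("Always", "IfNotPresent", "Never")


def with_pull_policy(pod_spec, policy="Never"):
    """Return a copy of a pod spec dict with imagePullPolicy set on every container.

    `with_pull_policy` is an illustrative helper, not part of fairing or Kubernetes.
    """
    if policy not in VALID_POLICIES:
        raise ValueError("unknown imagePullPolicy: {!r}".format(policy))
    patched = copy.deepcopy(pod_spec)  # leave the caller's spec untouched
    for key in ("initContainers", "containers"):
        for container in patched.get(key, []):
            container["imagePullPolicy"] = policy
    return patched


if __name__ == "__main__":
    spec = {
        "restartPolicy": "Never",
        "containers": [
            {"name": "main", "image": "localhost:32000/fairing-job:640E50A"}
        ],
    }
    print(with_pull_policy(spec))
```

Note that with `Never` the kubelet does not contact the registry at all, so the image must already be present in the node's container runtime or the pod will fail to start.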
/area example /priority p2
Using microk8s > 1.13 will hit this error, since it uses microk8s.ctr and dockerd has been replaced with containerd.
The 'append builder' calls a Layer class method that originally comes from containerregistry, but fairing carries an older version. See the difference in append_.py: https://github.com/google/containerregistry/blob/8a11dc8c53003ecf5b72ffaf035ba280109356ac/client/v2_2/append_.py#L68
I've tried changing 'mediaType' to 'docker_http.LAYER_MIME' in the fairing code, but it still does not work. The image manifest or digest seems to be incompatible. We need to check with containerregistry whether containerd-style images are supported and can be built with the Layer class method.
What do you think, @jinchihe?
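One way to look at what the registry actually stores is to fetch the manifest through the registry HTTP API and list the layer media types. The sketch below assumes a plain-HTTP registry such as localhost:32000 and treats the standard Docker schema 2 layer type as the expected value; the helper names `fetch_manifest` and `unexpected_layers` are made up for this example, and the call may itself fail with the same 500 error seen in the pod events.

```python
import json
import urllib.request

# Standard Docker image manifest (schema 2) media types.
DOCKER_MANIFEST = "application/vnd.docker.distribution.manifest.v2+json"
DOCKER_LAYER = "application/vnd.docker.image.rootfs.diff.tar.gzip"


def fetch_manifest(registry, repo, reference):
    """GET /v2/<repo>/manifests/<reference> from a plain-HTTP registry.

    `reference` is a tag or a digest. Raises urllib.error.HTTPError on a
    non-2xx response.
    """
    url = "http://{}/v2/{}/manifests/{}".format(registry, repo, reference)
    request = urllib.request.Request(url, headers={"Accept": DOCKER_MANIFEST})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unexpected_layers(manifest, expected=DOCKER_LAYER):
    """Return (digest, mediaType) pairs for layers whose mediaType differs from `expected`."""
    return [
        (layer.get("digest"), layer.get("mediaType"))
        for layer in manifest.get("layers", [])
        if layer.get("mediaType") != expected
    ]


if __name__ == "__main__":
    manifest = fetch_manifest("localhost:32000", "fairing-job", "640E50A")
    print(manifest.get("mediaType"))
    print(unexpected_layers(manifest))
```

An empty list from `unexpected_layers` would suggest the layer media types are not the problem and that the manifest or digest mismatch lies elsewhere.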