Latest docker release (20.10.0) doesn't work with docker-machine
Hey there! I've noticed that the latest docker release (20.10.0 as of this writing) breaks docker-machine's create phase. The following no longer works:
$ docker-machine create --driver digitalocean --digitalocean-access-token REDACTED --engine-storage-driver overlay2 --digitalocean-image ubuntu-18-04-x64 --digitalocean-size s-1vcpu-1gb --digitalocean-region sfo2 dm-create-test
Running pre-create checks...
Creating machine...
(dm-create-test) Creating SSH key...
(dm-create-test) Creating Digital Ocean droplet...
(dm-create-test) Waiting for IP address to be assigned to the Droplet...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded
This has been failing for the last few hours. Pinning to a custom engine install URL seems to work, but I imagine most folks will hit this broken experience across all drivers. I've currently worked around this by setting:
--engine-install-url "https://releases.rancher.com/install-docker/19.03.9.sh"
https://get.docker.com/ is the default install URL, fwiw
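For anyone scripting this, here is a minimal sketch of wrapping the workaround in a shell function. The function name and the PINNED_INSTALL_URL variable are made up for illustration; the flag and the URL are the ones quoted above. It only prints the command so it can be reviewed first, and it leaves out driver-specific flags such as access tokens.

```shell
#!/bin/sh
# Hypothetical helper: prints a docker-machine create command that pins the
# engine install script instead of using the default https://get.docker.com/.
PINNED_INSTALL_URL="https://releases.rancher.com/install-docker/19.03.9.sh"

pinned_create_cmd() {
    # $1 = driver name, $2 = machine name; driver-specific flags are omitted.
    printf 'docker-machine create --driver %s --engine-install-url "%s" %s\n' \
        "$1" "$PINNED_INSTALL_URL" "$2"
}
```

Calling pinned_create_cmd digitalocean dm-create-test prints the create command with the pinned install URL; add the remaining driver flags before running it.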
Interestingly, I just tried with:
docker-machine create --driver google --google-project blablabla --google-machine-image https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/images/ubuntu-1804-bionic-v20201201 anyname
And it worked fine.
After doing docker-machine ssh anyname, I get:
sudo docker version
Client: Docker Engine - Community
Version: 20.10.0
...
Server: Docker Engine - Community
Engine:
Version: 20.10.0
I tried with 18, 19 and 20 local Docker versions. Docker machine is 0.16.0.
Do you see anything I'm missing (except that you used DO and I used GCP)?
Thanks for checking this out. I haven't tried with other providers, just DO, so the problem might be there. Will check again later
Looks like, according to https://github.com/JonasProgrammer/docker-machine-driver-hetzner/issues/54, Hetzner doesn't work either.
Interestingly, I just tried with:
docker-machine create --driver google --google-project blablabla --google-machine-image https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/images/ubuntu-1804-bionic-v20201201 anyname
And it worked fine.
After doing docker-machine ssh anyname, I get:
sudo docker version
Client: Docker Engine - Community
Version: 20.10.0
...
Server: Docker Engine - Community
Engine:
Version: 20.10.0
Querying the docker version will work, but attempt to run a container on that box now and it will fail.
@cardoe, not really:
$ docker-machine ssh anothertest
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1030-gcp x86_64)
...
Last login: Wed Dec 9 18:10:01 2020 from 34.90.202.113
docker-user@anothertest:~$ sudo docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
docker-user@anothertest:~$
Confirmed, the latest update broke docker-machine create.
Confirm as well
Me too, my workaround for gitlab runner and AWS driver was also to use engine-install-url with a fixed version script.
docker-machine -D create ... --driver ...
-D enables debug mode, so we can see where it breaks.
In the case of rackspace (and I suspect digitalocean as well), we discovered that the system was not allowing external connections to the docker service because the firewall was blocking the port 2376.
As a workaround we created an image with an additional firewall rule to allow 2376 on an external interface. Using that image, we were able to create a machine successfully.
Best would be to have the provision files updated with the proper fixes.
In my case (Gitlab runner at DigitalOcean, Ubuntu 20.04 LTS) the update to Docker 20.10.0 broke the runners as well. After a bit of debugging it seemed to me as if the docker daemon was only listening on the socket, not the port, when docker-machine started checking for the running docker daemon. According to the logs, half a minute after docker-machine gave up (10 retries), the docker daemon was restarted and also listened on port 2376.
I connected locally to the docker daemon and was able to run containers.
Haven't investigated further and just changed to Docker 19.03 with the parameter suggested by @joelgriffith, which fixes the issue for now.
This may be related to https://github.com/moby/moby/issues/41767 (systemctl docker start get stuck in cloud-init), for which we'll be publishing updated packages
@cardoe, not really:
$ docker-machine ssh anothertest
Use eval $(docker-machine env anothertest) and you'll see it's broken. You need to connect the same way that the rest of the system will connect. By SSHing in, you're using the UNIX socket. Attempting to do this via the TCP socket will show you the issue.
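A quick way to tell the two cases apart without going through the docker client is to probe the TCP port directly. The sketch below is only an illustration: tcp_port_open is a made-up name, and it assumes bash (for /dev/tcp) and the coreutils timeout command are available on the machine you run it from.

```shell
# Hypothetical helper: succeeds only if a TCP connection to host:port opens
# within 3 seconds. Relies on bash's /dev/tcp redirection and on `timeout`.
tcp_port_open() {
    # $1 = host or IP, $2 = port
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
```

For example, tcp_port_open "$(docker-machine ip anothertest)" 2376 should fail on an affected machine even though sudo docker run over SSH works there.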
As far as I know docker-machine is deprecated, and won't be updated anymore: see https://github.com/docker/machine/issues/4537
There will be no further releases of the boot2docker.iso either: https://github.com/boot2docker/boot2docker/pull/1408
So Docker 19.03 is the final release, unless Docker Inc. changes their minds about the projects...
We will have a fork of libmachine inside minikube, but the machine-drivers will not be compatible.
In the case of rackspace (and I suspect digitalocean as well), we discovered that the system was not allowing external connections to the docker service because the firewall was blocking the port 2376.
I have done some work for minikube, to switch from the old tcp to the new ssh connections.
This could be ported to docker-machine, if the project finds a new life somewhere else ? Like moby-machine and boot2moby, or whatever. Something that is accepting patches...
Basically it uses ssh:// for the docker host, but also need to use ssh-add for the keys.
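As a rough sketch of what that looks like from the client side (not the minikube implementation): the helper name below is invented, the key path is an assumption about where docker-machine keeps the machine's private key, and docker-user is simply the login seen on the GCP machine earlier in this thread.

```shell
# Hypothetical helper: prints a DOCKER_HOST value that uses the ssh transport.
docker_ssh_host() {
    # $1 = ssh user, $2 = host or IP
    printf 'ssh://%s@%s\n' "$1" "$2"
}

# Usage (not run here; paths and names are assumptions):
#   ssh-add ~/.docker/machine/machines/anothertest/id_rsa
#   DOCKER_HOST=$(docker_ssh_host docker-user "$(docker-machine ip anothertest)") docker version
```

This goes over port 22 instead of 2376, so it does not depend on the daemon's TCP listener. It needs a docker client that supports ssh:// hosts and a user on the remote side who is allowed to talk to the docker socket.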
I think I ran into this problem. I get the following error:
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded
My investigation shows that the docker daemon started on the target machine listens only on the docker socket.
A restart of the docker daemon at the "right" time allows it to reload the config and listen on the expected port.
Until v20.10 support is fixed, I've added a VERSION=19.03 in the --engine-install-url. I suppose using the rancher URL would work as well.
For people running gitlab-runner with docker+machine executor, what is frustrating is that there is no error in the docker-machine ls output.
@Miouge1 What's the syntax for engine-install-url? Could you provide a complete example? Thanks!
@webflo there's an example in the first post in this issue
Docker 20.10.1 packages are now available on download.docker.com, and may solve this issue; perhaps someone could give it a test to verify if the issue is resolved?
Docker 20.10.1 has the same problem - docker-machine writes the systemd unit file, calls daemon-reload, but for some reason ps ax |grep dockerd shows that the running dockerd has a blank -H argument. If you ssh in and restart docker manually, it comes up with the correct arguments.
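To make that check repeatable, here is a small sketch that reads ps output and reports whether any dockerd command line carries a tcp:// listener. The function name is made up, and the pattern assumes the flag appears in the -H tcp://... form.

```shell
# Hypothetical helper: reads `ps ax` output on stdin and succeeds if some
# dockerd command line includes a "-H tcp://..." listener argument.
dockerd_listens_on_tcp() {
    grep -q 'dockerd.* -H tcp://'
}
```

One possible use, with NAME standing in for the machine name: docker-machine ssh NAME "ps ax" | dockerd_listens_on_tcp || docker-machine ssh NAME "sudo systemctl restart docker". That only automates the manual restart described above; it does not fix the underlying problem.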
This could be ported to docker-machine, if the project finds a new life somewhere else ? Like moby-machine and boot2moby, or whatever. Something that is accepting patches...
Gitlab has a fairly active fork that releases fixes for various issues: https://gitlab.com/gitlab-org/docker-machine
It's one of the main executors for their test runners and it's still having this issue, so hopefully they'd accept a PR with a fix.
@tsnowlan : as far as I know, both https://github.com/machine-drivers/machine and https://gitlab.com/gitlab-org/ci-cd/docker-machine are for bugfixes only (not for development)
https://docs.gitlab.com/runner/executors/docker_machine.html#forked-version-of-docker-machine
The intent of this fork is to fix critical issues and bugs affecting running costs only. No new features will be added.
Things like using ssh transport for docker or adding support for ssh host keys are not really bug fixes.
So that is why it would need to start over, outside Docker. For now, just using Vagrant instead of "machine"...
I would be happy to contribute to a replacement for docker-machine / podman-machine, but won't really drive. It is clear that neither of the upstream companies wants anything to do with the open source projects anymore.
Ah, yeah. I suppose that while those could fix some issues they aren't really bug fixes themselves.
It's pretty frustrating. Gitlab just had a docker-machine release a week ago, after this problem had been reported and 20.10.1 released, and it's still not working.
Hi all! I read everything, but not sure if a workaround exists at this point. Trying to create a droplet on DigitalOcean, but no joy. Could you confirm, and if a workaround exists - make it clear please? This would really help, because I imagine that a lot of people would find this thread soon
@XedinUnknown read again, a workaround is literally in the first post
It appears to be talking about an arg, but I don't know what this is an arg for.
docker-machine, the very thing this issue is about
I confirm that this command works:
docker-machine create --driver digitalocean --digitalocean-image ubuntu-20-04-x64 --digitalocean-access-token $DOTOKEN --digitalocean-region fra1 --engine-install-url "https://releases.rancher.com/install-docker/19.03.9.sh" docker-ubuntu-2004
Thanks for flagging this issue and raising a quick fix using the --engine-install-url flag. We're using this as the final straw and migrating off docker-machine ASAP, it's just causing us too much of a maintenance headache.
What's interesting is that I only run into this issue when I use an AMI that already has Docker v20.10.2 installed. If I let docker-machine install Docker, it installs v20.10.2 and seems to work fine though.