Cannot connect to remote Docker daemon during upgrade
Description
I'm having a strange issue where I'm able to work with a remote swarm manager through regular docker CLI commands, but docker app is not able to communicate with the remote daemon for an app upgrade.
Steps to reproduce the issue:
▸ streaming-platform git:(master) docker context use sa-kafka-preprod
sa-kafka-preprod
Current context is now "sa-kafka-preprod"
▸ streaming-platform git:(master) docker context ls
NAME                DESCRIPTION                              DOCKER ENDPOINT                                                               KUBERNETES ENDPOINT               ORCHESTRATOR
ba-gen-int          ba-gen-int Cluster                       tcp://ba-gen-int-api-40d0dafa3a0f133a.elb.us-west-2.amazonaws.com:2376                                          swarm
default             Current DOCKER_HOST based configuration  unix:///var/run/docker.sock                                                   https://localhost:6443 (default)  swarm
sa-cat-trainer      Spend Analytics Categorization Trainer   tcp://sa-cat-trainer-api-85e9489d200b894f.elb.us-west-2.amazonaws.com:2376                                      swarm
sa-kafka-dev        Spend Analytics Kafka Development        tcp://sa-kafka-dev-api-4e7c8c601fcf8bc0.elb.us-west-2.amazonaws.com:2376                                        swarm
sa-kafka-preprod *  Spend Analytics Kafka Pre-Prod           tcp://sa-kafka-preprod-api-8615921daaa38546.elb.us-west-2.amazonaws.com:2376                                    swarm
sa-utils            Spend Analytics Utilities                tcp://sa-utils-api-9233a9e91144a8d5.elb.us-west-2.amazonaws.com:2376                                            swarm
▸ streaming-platform git:(master) docker stack ls
NAME SERVICES ORCHESTRATOR
host-services 3 Swarm
processing 1 Swarm
streaming 3 Swarm
▸ streaming-platform git:(master) docker app upgrade streaming --app-name streaming --parameters-file parameters/preprod.yml --with-registry-auth
Upgrade failed: Action "upgrade" failed: Cannot connect to the Docker daemon at tcp://sa-kafka-preprod-api-8615921daaa38546.elb.us-west-2.amazonaws.com:2376. Is the docker daemon running?
Describe the results you received:
docker app reports that it is unable to connect to the remote Docker daemon.
Describe the results you expected:
docker app should upgrade the app as commanded.
Additional information you deem important (e.g. issue happens only occasionally):
First time this has happened to me.
Output of docker version:
Client: Docker Engine - Community
 Version: 19.03.0-rc2
 API version: 1.39 (downgraded from 1.40)
 Go version: go1.12.5
 Git commit: f97efcc
 Built: Wed Jun 5 01:37:53 2019
 OS/Arch: darwin/amd64
 Experimental: true
Server: Docker Engine - Community
 Engine:
  Version: 18.09.6
  API version: 1.39 (minimum version 1.12)
  Go version: go1.10.8
  Git commit: 481bc77
  Built: Sat May 4 02:02:43 2019
  OS/Arch: linux/amd64
  Experimental: false
Output of docker-app version:
Version: v0.8.0-rc1
Git commit: 73d86744
Built: Wed Jun 5 01:45:27 2019
OS/Arch: darwin/amd64
Experimental: off
Renderers: none
Invocation Base Image: docker/cnab-app-base:v0.8.0-rc1
Output of docker info:
Client:
 Debug Mode: false
 Plugins:
  app: Docker Application (Docker Inc., v0.8.0-rc1)
  buildx: Build with BuildKit (Docker Inc., v0.2.2-tp-docker)
Server:
 Containers: 2
  Running: 1
  Paused: 0
  Stopped: 1
 Images: 4
 Server Version: 18.09.6
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: local
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: 43vx9cihpn6a6zktuc7gk23ht
  Is Manager: true
  ClusterID: jozp5n2jolwr3qogl031ynwgg
  Managers: 3
  Nodes: 7
  Default Address Pool: 10.0.0.0/8
  SubnetSize: 24
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.57.11.89
  Manager Addresses:
   10.57.11.89:2377
   10.57.21.108:2377
   10.57.31.114:2377
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
 runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.14.123-111.109.amzn2.x86_64
 Operating System: Amazon Linux 2
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 961.5MiB
 Name: ip-10-57-11-89.transzap.com
 ID: KVCQ:MTBJ:6XRJ:DAAF:SCKI:H43U:WERV:FDDU:4EJ3:PRCU:XIT4:VREE
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
Additional environment details (AWS, VirtualBox, physical, etc.):
The remote cluster is on AWS. Local client is Docker for Mac.
Hello @kinghuang, thank you for filing this pretty weird issue 👍 We will try to reproduce it for investigation, but it might take some time.
Hi @kinghuang, we couldn't reproduce this issue. Can you provide more information about your setup? If you hit this issue again, could you connect manually to the VM and check whether the Docker daemon is running?
@aiordache Thanks for looking into it. I believe the behaviour I'm seeing is caused by VPN. I hop between multiple VPN tunnels to get access to all my clusters. I'll try to come up with a reproducible case.
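To help narrow down a VPN-related cause, one option is to check from the client machine whether the context's Docker endpoint accepts TCP connections at all. Below is a minimal Python sketch of such a check. It is not part of docker app, and the helper names `parse_tcp_endpoint` and `can_connect` are hypothetical. A successful connection only shows that the port is reachable over the current network route; it says nothing about TLS or the daemon API itself.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def parse_tcp_endpoint(endpoint: str) -> Tuple[str, int]:
    """Split a tcp://host:port Docker endpoint into (host, port).

    Hypothetical helper for illustration; raises ValueError for
    non-TCP endpoints (for example unix:// sockets) or a missing port.
    """
    parsed = urlparse(endpoint)
    if parsed.scheme != "tcp" or parsed.hostname is None or parsed.port is None:
        raise ValueError(f"expected tcp://host:port, got {endpoint!r}")
    return parsed.hostname, parsed.port


def can_connect(endpoint: str, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to the endpoint succeeds.

    This tests network reachability only (DNS, routing, firewall, VPN);
    it does not perform a TLS handshake or talk to the Docker API.
    """
    host, port = parse_tcp_endpoint(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect("tcp://sa-kafka-preprod-api-8615921daaa38546.elb.us-west-2.amazonaws.com:2376")` could be run once with each VPN tunnel active, to see whether reachability changes between them.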