Docker container not responding to anything
- [x] This is a bug report
- [ ] This is a feature request
- [ ] I searched existing issues before opening this one
**Expected behavior**

The container should keep responding on its published port.
**Actual behavior**

Sometimes the Docker containers on the system stop responding to anything, yet `docker ps` still shows them as up and running.
**Steps to reproduce the behavior**

- Create a custom image with `node:10.11.0` as the base image and any sample Node project that listens on port 4000:
```dockerfile
# Step 1
FROM node:10.11.0

# Step 2
LABEL version="1.0"

# Step 3
RUN mkdir -p /usr/src/core-api
WORKDIR /usr/src/core-api

# Step 4
COPY package*.json ./
COPY tsconfig*.json ./
COPY tslint*.json ./

# Step 5
RUN npm install pm2 -g

# Step 6
RUN cd /usr/src/core-api && npm install --production

# Step 7
COPY ./dist ./dist

# Step 8
EXPOSE 4000

# Step 9
CMD npm run start-docker
```
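For completeness, the image can then be built and pushed along these lines (a sketch; the tag matches the one used in the compose file, adjust for your registry):

```shell
# Build the image from the directory containing the Dockerfile above
docker build -t libsynadmin/libsynmp:core-api-staging-0.6.0.1 .

# Push the tagged image to the registry
docker push libsynadmin/libsynmp:core-api-staging-0.6.0.1
```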
- Push the custom image to the repository.
- Run the image with Docker Compose:
```yaml
version: '2'
services:
  core:
    image: libsynadmin/libsynmp:core-api-staging-0.6.0.1
    restart: always
    volumes:
      - /home/rspurohit/core-api/public/podcast-images:/usr/src/core-api/dist/libsyn-mp.core/src/public/podcast-images
      - /home/rspurohit/core-api/public/campaign-documents:/usr/src/core-api/dist/libsyn-mp.core/src/public/campaign-documents
      - /home/rspurohit/core-api/public/network-images:/usr/src/core-api/dist/libsyn-mp.core/src/public/network-images
      - /home/rspurohit/core-api/public/user-profile-images:/usr/src/core-api/dist/libsyn-mp.core/src/public/user-profile-images
      - /home/rspurohit/core-api/public/smart-proposals:/usr/src/core-api/dist/libsyn-mp.core/src/public/smart-proposals
      - /home/rspurohit/core-api/public/insertion-orders:/usr/src/core-api/dist/libsyn-mp.core/src/public/insertion-orders
      - /home/rspurohit/core-api/public/invoice:/usr/src/core-api/dist/libsyn-mp.core/src/public/invoice
      - /home/rspurohit/core-api/public/payslip:/usr/src/core-api/dist/libsyn-mp.core/src/public/payslip
    ports:
      - '4000:4000'
      - '3000:3000'
    network_mode: bridge
    environment:
      - NODE_ENV=development-local
```
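A minimal sketch of how we run and check the service, assuming the compose file above is in the current directory:

```shell
# Bring the service up in the background
docker-compose up -d

# When the bug occurs, this request hangs or times out even though
# `docker ps` still reports the container as Up
curl -m 10 http://localhost:4000
```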
- The container will be up and running, but sometimes it stops responding and we get no output when hitting http://localhost:4000, even though both the container and Docker appear to be up and running.
- There are 5 containers running on my system with the same configuration as above but different services, and all of them stop responding at the same time.
- Once I restart the Docker service, everything goes back to normal.
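The restart workaround amounts to the following (on a systemd host such as Ubuntu 16.04; restarting only the container does not help, the daemon itself has to be restarted):

```shell
# Restart the Docker engine; containers with restart: always come back up
sudo systemctl restart docker
```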
**Output of `docker version`:**
```
Client:
 Version:       18.03.1-ce
 API version:   1.37
 Go version:    go1.9.5
 Git commit:    9ee9f40
 Built:         Thu Apr 26 07:17:20 2018
 OS/Arch:       linux/amd64
 Experimental:  false
 Orchestrator:  swarm

Server:
 Engine:
  Version:      18.03.1-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.5
  Git commit:   9ee9f40
  Built:        Thu Apr 26 07:15:30 2018
  OS/Arch:      linux/amd64
  Experimental: false
```
**Output of `docker info`:**
```
Containers: 6
 Running: 5
 Paused: 0
 Stopped: 1
Images: 13
Server Version: 18.03.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 773c489c9c1b21a6d78b5c538cd395416ec50f88
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-128-generic
Operating System: Ubuntu 16.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.702GiB
Name: lsyn-bgqa
ID: RFOD:4VQN:N4DY:IBTV:FADZ:V6CW:6ZMB:FJL2:365D:43FI:WKNP:MI5W
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: libsynadmin
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
```
**Additional environment details (AWS, VirtualBox, physical, etc.)**

It is a VM with the following details:

```
NAME="Ubuntu"
VERSION="16.04.4 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.4 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
```
The same happens regularly with different containers on my machine, with Docker version 18.09.2 and Ubuntu 18.04. The only solution is to stop the Docker service and then start it again.
We experienced the same: rare events (roughly once in 2 months of running) for one of the containers in a Swarm on RHEL 7, with Docker CE v20.10.6. The symptoms are:

- The container stops responding on any of its ports.
- Restarting the container or the Swarm doesn't help.
- Rebuilding the image doesn't help.
- Restarting the Docker service solves the issue.

This is quite scary.
The issue still exists. Restarting anything doesn't help. Has anyone found a solution for this?
Same issue here. Restarting the Docker container solves it for me, but it is scary.
We notice this with Docker version 20.10.14, build a224086, on Ubuntu 18.04.6 LTS.

Also on Docker version 20.10.8, build 3967b7d, on Ubuntu 18.04.3 LTS. Same symptoms as mentioned by @DKroot.
@GC-Elia Could you paste the output of `docker info`, dump a stack-trace log file (see the procedure here), and attach it here?
@akerouanton `docker info` was unresponsive, just like every other docker command except `docker ps`.

I will update with the stack trace the next time it happens (I have already restarted the service).
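For reference, the usual way to get that dump on Linux is to send SIGUSR1 to the daemon, which writes a goroutine stack dump without stopping dockerd (paths may vary by distribution):

```shell
# Ask dockerd to dump all goroutine stacks; the daemon keeps running
sudo kill -SIGUSR1 "$(pidof dockerd)"

# The dump is written under the daemon's run directory, and its exact
# location is also logged in the daemon log (journalctl -u docker)
ls /var/run/docker/goroutine-stacks-*.log
```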
Did you figure out what the issue might be? Do you know of a troubleshooting guide for Docker containers? In my case the process is running and even shows some CPU usage, but that particular container does not respond to any command; the other containers are accessible.
I have the same problem with Docker version 20.10.17, build 100c701, on `Linux xxx 5.15.0-41-generic #44~20.04.1-Ubuntu SMP Fri Jun 24 13:27:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux`.
Can a deadlock within the container cause this? My container is completely unresponsive: no stop, kill, inspect, or logs.
Hello, we're also seeing this in an OpenStack environment under a kolla-ansible deployment, with Docker 20.10.18 on CentOS 7 and a 5.4.211-1.el7.elrepo.x86_64 kernel.

The hanging container is an OVS vswitchd container:

```
1e6ef5c89390   kolla/centos-source-openvswitch-vswitchd:train   "dumb-init --single-…"   6 months ago   Up 6 months   openvswitch_vswitchd
```

`docker ps` and `docker info` work, but `stats`, `restart`, `exec` etc. do not. There is no mention in dmesg or journalctl of anything to do with this hang, for either Docker or the hanging container.

The processes are still working, and we will hold off on dumping a trace until we manage to empty the node of instances, at which point we'll dump a stack trace and attach it as well.
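In the meantime, one way to look at such a hung container from the host side is via its main PID (a sketch; `docker inspect` may itself hang in this state, in which case the PID can be found with `ps`/`pstree` instead):

```shell
CID=1e6ef5c89390   # the hanging container from `docker ps`

# -f keeps the output small; resolve the container's init PID on the host
PID="$(docker inspect -f '{{.State.Pid}}' "$CID")"

# A process stuck in D state (uninterruptible sleep) usually points at
# a blocked kernel call rather than the application itself
ps -o pid,stat,wchan:32,cmd -p "$PID"
```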
Attached `docker info` and the stack-trace dump: docker-info.txt, goroutine-stacks-2023-05-25T084842Z.log
After first trying to reload and then restart the Docker service, it hung on the service restart and we got new logs from containerd:

```
May 25 08:56:09 compute16 dockerd: time="2023-05-25T08:56:09.666961806Z" level=info msg="Container failed to exit within 10s of signal 15 - using the force" container=1e6ef5c89390002ae65864a21a5dc0fe60341a352b5921ca40921a85ca23ea93
May 25 08:56:11 compute16 containerd: time="2023-05-25T08:56:11.674275056Z" level=error msg="get state for 1e6ef5c89390002ae65864a21a5dc0fe60341a352b5921ca40921a85ca23ea93" error="context deadline exceeded: unknown"
May 25 08:56:11 compute16 containerd: time="2023-05-25T08:56:11.676327116Z" level=warning msg="unknown status" status=0
```
Only after killing the dockerd PID could we get the container and the Docker service fully responding again.
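In other words, the recovery sequence looked roughly like this (note that without live-restore enabled, killing dockerd also takes down the running containers):

```shell
# A plain restart hangs on the stuck container
sudo systemctl restart docker     # hangs / times out

# Force-kill the daemon process, then start the service again
sudo kill -9 "$(pidof dockerd)"
sudo systemctl start docker
```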
Same experience with Docker 23.0.1:

```
Client: Docker Engine - Community
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.5
 Git commit:        a5ee5b1
 Built:             Thu Feb 9 19:51:00 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.5
  Git commit:       bc3805a
  Built:            Thu Feb 9 19:48:42 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.16
  GitCommit:        31aa4358a36870b21a992d3ad2bef29e1d693bec
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
```

System:

```
Linux ... 5.4.17-2136.315.5.el7uek.x86_64 #2 SMP Wed Dec 21 19:57:57 PST 2022 x86_64 x86_64 x86_64 GNU/Linux
```
Some people reported here that restarting Docker helped; that was not the case for me. I had to restart the whole machine, and after the restart the container was still there with status `Dead`.
The inspection of the dead container shows the following:

```json
"State": {
    "Status": "dead",
    "Running": false,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": true,
    "Pid": 0,
    "ExitCode": 255,
    "Error": "",
    "StartedAt": "2023-07-18T06:52:21.535364308Z",
    "FinishedAt": "2023-07-31T11:19:59.586319264+02:00",
    "Health": {
        "Status": "unhealthy",
        "FailingStreak": 71,
        "Log": [
            {
                "Start": "2023-07-31T11:10:12.685367803+02:00",
                "End": "2023-07-31T11:10:53.982444816+02:00",
                "ExitCode": -1,
                "Output": "timed out starting health check for container 12dfd695fa58e71ebda2ea12b832a616bb10dbe84aa1e4e5fd72bbb6359833d3"
            },
            {
                "Start": "2023-07-31T11:13:25.124506079+02:00",
                "End": "2023-07-31T11:13:50.756552511+02:00",
                "ExitCode": -1,
                "Output": "cannot exec in a stopped state: unknown"
            },
            {
                "Start": "2023-07-31T11:14:20.915017447+02:00",
                "End": "2023-07-31T11:14:50.756872688+02:00",
                "ExitCode": -1,
                "Output": "cannot exec in a stopped state: unknown"
            },
            {
                "Start": "2023-07-31T11:15:33.030246325+02:00",
                "End": "2023-07-31T11:15:50.758445901+02:00",
                "ExitCode": -1,
                "Output": "cannot exec in a stopped state: unknown"
            },
            {
                "Start": "2023-07-31T11:16:20.799750496+02:00",
                "End": "2023-07-31T11:16:50.800662276+02:00",
                "ExitCode": -1,
                "Output": "timed out starting health check for container 12dfd695fa58e71ebda2ea12b832a616bb10dbe84aa1e4e5fd72bbb6359833d3"
            }
        ]
    }
}
```
Note the health-log entries with output "cannot exec in a stopped state: unknown": the container was not stopped; its state was running (while unhealthy). Maybe this could help?
The dead container can be removed like any other with `docker rm`.
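For anyone hitting the same state, pulling the health history and cleaning up can be done along these lines (the container ID is the one from the inspect output above; any unique prefix works):

```shell
# Extract just the health-check history from the inspect output
docker inspect -f '{{json .State.Health}}' 12dfd695fa58 | python3 -m json.tool

# A container stuck in the Dead state can be removed like any other
docker rm 12dfd695fa58
```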