nerdctl is unable to restart a container via the `nerdctl restart` command

Description

According to the documentation, `nerdctl restart` should let me restart one or more containers running on the system. However, when I try this on containerd-based infrastructure, it fails with a fatal error.

root@tardis-169-254-6-46:/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/k8s.io# nerdctl ps | grep metrics-server
3eb77b051f92    myregistry.gallifrey.com/gallifrey-docker/pause:3.6                                                                              "/pause"                  2 days ago        Up                 k8s://kube-system/metrics-server-6c8bbfdb57-gz6sx
ff39a309f6d8    myregistry.gallifrey.com/gallifrey-docker/metrics-server:2.0.3                                                               "/metrics-server --r…"    9 seconds ago     Up                     k8s://kube-system/metrics-server-6c8bbfdb57-gz6sx/metrics-server


root@tardis-169-254-6-46:/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/k8s.io# nerdctl restart ff39a309f6d8
FATA[0010] failed to start shim: mkdir /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/k8s.io/ff39a309f6d85bf50e98ca59600739b6bfbac4452d8f7cbca57f92e56d26dff5: file exists: unknown
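
The error names a specific directory that already exists when the shim tries to create it. As a minimal sketch for anyone triaging this, the snippet below extracts that path from the message and checks whether it is present on the node. The helper name `extract_shim_dir` is hypothetical, not part of nerdctl or containerd, and the script only inspects the directory; it does not modify or remove anything.

```shell
#!/bin/sh
# Hypothetical helper (not a nerdctl/containerd command): pull the directory
# path out of a "failed to start shim: mkdir <path>: file exists" message.
extract_shim_dir() {
  printf '%s\n' "$1" |
    sed -n 's/.*failed to start shim: mkdir \(.*\): file exists.*/\1/p'
}

# Error message copied from the report above.
msg='FATA[0010] failed to start shim: mkdir /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/k8s.io/ff39a309f6d85bf50e98ca59600739b6bfbac4452d8f7cbca57f92e56d26dff5: file exists: unknown'

dir=$(extract_shim_dir "$msg")

# Report whether the directory is currently present; read-only inspection.
if [ -d "$dir" ]; then
  echo "directory is present: $dir"
  ls -la "$dir"
else
  echo "directory is not present: $dir"
fi
```

Running this right after a failed restart, and again a few seconds later, would show whether the directory is stale or is being recreated by something else.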

Steps to reproduce the issue

Create a Kubernetes cluster that uses containerd as the CRI runtime
On the node, use `nerdctl restart` to restart any non-pause container of a pod

Below is the config of the containerd daemon:


version=2
root = "/data/docker/containerd/daemon"
state = "/var/run/docker/containerd/daemon"
oom_score = -500
required_plugins = ["io.containerd.grpc.v1.cri"]

[grpc]
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[debug]
  uid = 0
  gid = 0
  level = "info"

[metrics]
  address = "localhost:1414"
  grpc_histogram = true

[cgroup]
  path = ""

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    stream_server_address = "127.0.0.1"
    max_container_log_line_size = 262144
    sandbox_image = "myregistry.gallifrey.com/gallifrey-docker/pause:3.6"
  [plugins."io.containerd.grpc.v1.cri".cni]
    bin_dir = "/opt/cni/bin"
    conf_dir = "/etc/cni/net.d"
    conf_template = "/etc/cni/net.d/10-calico.conflist"
  [plugins."io.containerd.grpc.v1.cri".containerd]
    default_runtime_name = "runc"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    runtime_type = "io.containerd.runc.v2"

Describe the results you received and expected

I expected the container to restart without a fatal error. Instead I received:

root@tardis-169-254-6-46:/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/k8s.io# nerdctl restart ff39a309f6d8
FATA[0010] failed to start shim: mkdir /var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/k8s.io/ff39a309f6d85bf50e98ca59600739b6bfbac4452d8f7cbca57f92e56d26dff5: file exists: unknown

What version of nerdctl are you using?

WARN[0000] unable to determine buildctl version: exec: "buildctl": executable file not found in $PATH
Client:
 Version:	v0.22.0
 OS/Arch:	linux/amd64
 Git commit:	8e278e2aa61a89d4e50d1a534217f264bd1a5ddf
 builctl:
  Version:

Server:
 containerd:
  Version:	1.6.6
  GitCommit:	10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:	1.1.2
  GitCommit:	v1.1.2-0-ga916309

Are you using a variant of nerdctl? (e.g., Rancher Desktop)

NO

Host information

Client:
 Namespace:	k8s.io
 Debug Mode:	false

Server:
 Server Version: 1.6.6
 Storage Driver: overlayfs
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Log: fluentd journald json-file
  Storage: aufs native overlayfs
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.0-122-generic
 Operating System: Ubuntu 18.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 19.55GiB
 Name: tardis-169-254-6-46
 ID: 99bbe607-8a58-4737-8f7f-86b82d853867

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Jul 29 '22 08:07 harshanarayana

Any updates? I got the same error.

Sep 15 '22 07:09 TonyKingdom

@harshanarayana @TonyKingdom
Does this only happen inside a k8s node, and for all running containers?
The error seems to me to be more on the containerd/shim side than in nerdctl.

If you run a container directly on your host, can that container be restarted? e.g.
$ nerdctl run -d --name nginx -p 80:80 nginx:alpine
$ nerdctl restart nginx

Dec 09 '22 05:12 fangn2

I encountered the same issue. nerdctl can restart a container that it created itself, but restarting a container in the k8s.io namespace produces the error. I suspect that kubelet and nerdctl restarting the container at the same time causes the issue.
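
To illustrate where a "file exists" error of this kind comes from, here is a minimal sketch with two consecutive attempts to create the same directory using a plain `mkdir` (no `-p`). It is only an analogy for the suspected collision: the temporary path and the `create_dir` helper are hypothetical stand-ins, and this does not reproduce containerd's or kubelet's actual behavior.

```shell
#!/bin/sh
# Scratch area so the demo never touches real containerd state.
demo_root=$(mktemp -d)
target="$demo_root/container-id"   # hypothetical stand-in for the per-container directory

# Plain mkdir (without -p) fails when the path already exists.
create_dir() {
  mkdir "$1" 2>/dev/null
}

# First actor creates the directory.
if create_dir "$target"; then
  echo "first create: ok"
fi

# Second actor tries to create the same path and is rejected.
if create_dir "$target"; then
  echo "second create: ok"
else
  echo "second create: failed, directory already exists"
fi

rm -rf "$demo_root"
```

Whichever process arrives second sees the failure, which would be consistent with the error appearing only when something else has already created, or not yet cleaned up, the directory.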

Mar 02 '23 03:03 jiayu1016

I have the same issue. I tried to restart an nginx container that was created by nerdctl compose up, and I got the same error message:

FATA[0000] 1 errors:
failed to start shim: mkdir /run/containerd/io.containerd.runtime.v2.task/default/5346acf36593b69abd4ec7e0f48eb9095e4dd6ba0d637bd3d21053269dfff051: file exists: unknown

I only installed the nerdctl full package.
OS: CentOS 7.6
Version:

Client:
 Version:       v1.5.0
 OS/Arch:       linux/amd64
 Git commit:    b33a58f288bc42351404a016e694190b897cd252
 buildctl:
  Version:      v0.12.0
  GitCommit:    18fc875d9bfd6e065cd8211abc639434ba65aa56

Server:
 containerd:
  Version:      v1.7.3
  GitCommit:    7880925980b188f4c97b462f709d0db8e8962aff
 runc:
  Version:      1.1.8
  GitCommit:    v1.1.8-0-g82f18fe0

Sep 07 '23 01:09 Leeyon