nvidia-docker
nvml error: insufficient permissions
1. Issue or feature description
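Running nvidia-smi in a container started through nvidia-docker fails: the container never starts, and the NVIDIA runtime hook reports "nvml error: insufficient permissions".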
2. Steps to reproduce the issue
❯ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: nvml error: insufficient permissions: unknown.
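The error above is several layers deep (Docker daemon, OCI runtime, prestart hook, nvidia-container-cli, NVML), and the useful part is what the hook wrote to stderr. As a small convenience sketch, the helper below pulls out just that part. The name hook_stderr is made up for illustration, and the message in the example is a shortened copy of the line above.

```shell
# Print only the hook's stderr portion of a Docker/OCI error message.
# Reads the message on stdin; prints nothing if there is no "stderr: " marker.
hook_stderr() {
  sed -n 's/.*stderr: //p'
}

# Shortened copy of the error line quoted above (middle part elided by hand).
msg='docker: Error response from daemon: OCI runtime create failed: ... error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: nvml error: insufficient permissions: unknown.'
printf '%s\n' "$msg" | hook_stderr
```

In practice the same filter can be fed from the failing command, for example by redirecting the stderr of the docker run command to stdout and piping it into hook_stderr.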
3. Information to attach (optional if deemed irrelevant)
- [x] Some nvidia-container information:
nvidia-container-cli -k -d /dev/tty info: https://gist.github.com/ethanabrooks/bc2d9cc3c3fb61e0b188b18bcb2fe15e
- [x] Kernel version from
uname -a: Linux rldl8 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
- [x] Any relevant kernel output lines from
dmesg: https://gist.github.com/ethanabrooks/a2b1d485499fb724e32a89e8f3ed218a
- [x] Driver information from
nvidia-smi -a:
Wed Jul 21 19:00:09 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:02:00.0 Off |                  N/A |
| 29%   16C    P8    10W / 250W |      1MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:03:00.0 Off |                  N/A |
| 29%   14C    P8     7W / 250W |      1MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  On   | 00000000:82:00.0 Off |                  N/A |
| 29%   15C    P8     7W / 250W |      1MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  On   | 00000000:83:00.0 Off |                  N/A |
| 29%   15C    P8     8W / 250W |      1MiB / 11178MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
- [x] Docker version from
docker version:
Client: Docker Engine - Community
 Version:           20.10.6
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        370c289
 Built:             Fri Apr 9 22:46:01 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
Server: Docker Engine - Community
 Engine:
  Version:          20.10.6
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8728dd2
  Built:            Fri Apr 9 22:44:13 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
- [x] NVIDIA packages version from
dpkg -l '*nvidia*' or rpm -qa '*nvidia*':
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-======================================-========================-========================-==================================================================================
un libgldispatch0-nvidia <none> <none> (no description available)
ii libnvidia-cfg1-470:amd64 470.57.02-0ubuntu1 amd64 NVIDIA binary OpenGL/GLX configuration library
un libnvidia-cfg1-any <none> <none> (no description available)
un libnvidia-common <none> <none> (no description available)
ii libnvidia-common-470 470.57.02-0ubuntu1 all Shared files used by the NVIDIA libraries
ii libnvidia-compute-470:amd64 470.57.02-0ubuntu1 amd64 NVIDIA libcompute package
ii libnvidia-container-tools 1.4.0-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.4.0-1 amd64 NVIDIA container runtime library
un libnvidia-decode <none> <none> (no description available)
ii libnvidia-decode-470:amd64 470.57.02-0ubuntu1 amd64 NVIDIA Video Decoding runtime libraries
un libnvidia-encode <none> <none> (no description available)
ii libnvidia-encode-470:amd64 470.57.02-0ubuntu1 amd64 NVENC Video Encoding runtime library
un libnvidia-extra <none> <none> (no description available)
ii libnvidia-extra-470:amd64 470.57.02-0ubuntu1 amd64 Extra libraries for the NVIDIA driver
un libnvidia-fbc1 <none> <none> (no description available)
ii libnvidia-fbc1-470:amd64 470.57.02-0ubuntu1 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
un libnvidia-gl <none> <none> (no description available)
ii libnvidia-gl-470:amd64 470.57.02-0ubuntu1 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un libnvidia-ifr1 <none> <none> (no description available)
ii libnvidia-ifr1-470:amd64 470.57.02-0ubuntu1 amd64 NVIDIA OpenGL-based Inband Frame Readback runtime library
un libnvidia-ml1 <none> <none> (no description available)
un nvidia-304 <none> <none> (no description available)
un nvidia-340 <none> <none> (no description available)
un nvidia-384 <none> <none> (no description available)
un nvidia-390 <none> <none> (no description available)
un nvidia-common <none> <none> (no description available)
ii nvidia-compute-utils-470 470.57.02-0ubuntu1 amd64 NVIDIA compute utilities
un nvidia-container-runtime <none> <none> (no description available)
un nvidia-container-runtime-hook <none> <none> (no description available)
ii nvidia-container-toolkit 1.5.1-1 amd64 NVIDIA container runtime hook
ii nvidia-dkms-470 470.57.02-0ubuntu1 amd64 NVIDIA DKMS package
un nvidia-dkms-kernel <none> <none> (no description available)
ii nvidia-driver-470 470.57.02-0ubuntu1 amd64 NVIDIA driver metapackage
un nvidia-driver-binary <none> <none> (no description available)
un nvidia-kernel-common <none> <none> (no description available)
ii nvidia-kernel-common-470 470.57.02-0ubuntu1 amd64 Shared files used with the kernel module
un nvidia-kernel-source <none> <none> (no description available)
ii nvidia-kernel-source-470 470.57.02-0ubuntu1 amd64 NVIDIA kernel source package
un nvidia-legacy-340xx-vdpau-driver <none> <none> (no description available)
un nvidia-libopencl1-dev <none> <none> (no description available)
ii nvidia-modprobe 470.57.02-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
un nvidia-opencl-icd <none> <none> (no description available)
un nvidia-persistenced <none> <none> (no description available)
ii nvidia-prime 0.8.16~0.18.04.1 all Tools to enable NVIDIA's Prime
ii nvidia-settings 470.57.02-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
un nvidia-settings-binary <none> <none> (no description available)
un nvidia-smi <none> <none> (no description available)
un nvidia-utils <none> <none> (no description available)
ii nvidia-utils-470 470.57.02-0ubuntu1 amd64 NVIDIA driver support binaries
un nvidia-vdpau-driver <none> <none> (no description available)
ii xserver-xorg-video-nvidia-470 470.57.02-0ubuntu1 amd64 NVIDIA binary Xorg driver
- [x] NVIDIA container library version from
nvidia-container-cli -V:
version: 1.4.0
build date: 2021-04-24T14:25+00:00
build revision: 704a698b7a0ceec07a48e56c37365c741718c2df
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
- [x] NVIDIA container library logs (see troubleshooting)
I0721 22:54:19.158082 3729 nvc.c:372] initializing library context (version=1.4.0, build=704a698b7a0ceec07a48e56c37365c741718c2df)
I0721 22:54:19.158192 3729 nvc.c:346] using root /
I0721 22:54:19.158207 3729 nvc.c:347] using ldcache /etc/ld.so.cache
I0721 22:54:19.158220 3729 nvc.c:348] using unprivileged user 65534:65534
I0721 22:54:19.158255 3729 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0721 22:54:19.158488 3729 nvc.c:391] dxcore initialization failed, continuing assuming a non-WSL environment
I0721 22:54:19.164111 3736 nvc.c:274] loading kernel module nvidia
I0721 22:54:19.164436 3736 nvc.c:278] running mknod for /dev/nvidiactl
I0721 22:54:19.164516 3736 nvc.c:282] running mknod for /dev/nvidia0
I0721 22:54:19.164572 3736 nvc.c:282] running mknod for /dev/nvidia1
I0721 22:54:19.164624 3736 nvc.c:282] running mknod for /dev/nvidia2
I0721 22:54:19.164675 3736 nvc.c:282] running mknod for /dev/nvidia3
I0721 22:54:19.164727 3736 nvc.c:286] running mknod for all nvcaps in /dev/nvidia-caps
I0721 22:54:19.176785 3736 nvc.c:214] running mknod for /dev/nvidia-caps/nvidia-cap1 from /proc/driver/nvidia/capabilities/mig/config
I0721 22:54:19.177158 3736 nvc.c:214] running mknod for /dev/nvidia-caps/nvidia-cap2 from /proc/driver/nvidia/capabilities/mig/monitor
I0721 22:54:19.184903 3736 nvc.c:292] loading kernel module nvidia_uvm
I0721 22:54:19.184969 3736 nvc.c:296] running mknod for /dev/nvidia-uvm
I0721 22:54:19.185072 3736 nvc.c:301] loading kernel module nvidia_modeset
I0721 22:54:19.185123 3736 nvc.c:305] running mknod for /dev/nvidia-modeset
I0721 22:54:19.185489 3737 driver.c:101] starting driver service
I0721 22:54:19.190228 3729 driver.c:203] driver service terminated with signal 15
-- WARNING, the following logs are for debugging purposes only --
I0721 22:55:58.102241 4055 nvc.c:372] initializing library context (version=1.4.0, build=704a698b7a0ceec07a48e56c37365c741718c2df)
I0721 22:55:58.102351 4055 nvc.c:346] using root /
I0721 22:55:58.102370 4055 nvc.c:347] using ldcache /etc/ld.so.cache
I0721 22:55:58.102386 4055 nvc.c:348] using unprivileged user 65534:65534
I0721 22:55:58.102428 4055 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0721 22:55:58.102674 4055 nvc.c:391] dxcore initialization failed, continuing assuming a non-WSL environment
I0721 22:55:58.107939 4063 nvc.c:274] loading kernel module nvidia
I0721 22:55:58.108213 4063 nvc.c:278] running mknod for /dev/nvidiactl
I0721 22:55:58.108301 4063 nvc.c:282] running mknod for /dev/nvidia0
I0721 22:55:58.108361 4063 nvc.c:282] running mknod for /dev/nvidia1
I0721 22:55:58.108418 4063 nvc.c:282] running mknod for /dev/nvidia2
I0721 22:55:58.108475 4063 nvc.c:282] running mknod for /dev/nvidia3
I0721 22:55:58.108530 4063 nvc.c:286] running mknod for all nvcaps in /dev/nvidia-caps
I0721 22:55:58.120627 4063 nvc.c:214] running mknod for /dev/nvidia-caps/nvidia-cap1 from /proc/driver/nvidia/capabilities/mig/config
I0721 22:55:58.120952 4063 nvc.c:214] running mknod for /dev/nvidia-caps/nvidia-cap2 from /proc/driver/nvidia/capabilities/mig/monitor
I0721 22:55:58.129024 4063 nvc.c:292] loading kernel module nvidia_uvm
I0721 22:55:58.129091 4063 nvc.c:296] running mknod for /dev/nvidia-uvm
I0721 22:55:58.129212 4063 nvc.c:301] loading kernel module nvidia_modeset
I0721 22:55:58.129266 4063 nvc.c:305] running mknod for /dev/nvidia-modeset
I0721 22:55:58.129659 4064 driver.c:101] starting driver service
I0721 22:55:58.134859 4055 driver.c:203] driver service terminated with signal 15
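The logs show the library switching to unprivileged user 65534:65534 before the driver service starts, while NVML reports insufficient permissions. One possible cause (an assumption, not something these logs confirm, and only one of several candidates) is that the NVIDIA device nodes are not accessible to that user. A minimal sketch to check this on the host is below. The function name check_nodes is made up for illustration, the device paths are taken from the mknod lines in the log, and stat -c assumes GNU coreutils.

```shell
# Report mode and owner for each path given, flagging nodes that are not
# both readable and writable by "other" users. Assumes GNU stat (coreutils).
check_nodes() {
  for node in "$@"; do
    [ -e "$node" ] || continue          # skip unmatched globs and missing nodes
    mode=$(stat -c '%a' "$node")        # octal permission bits, e.g. 666
    owner=$(stat -c '%U:%G' "$node")    # owning user and group
    other=${mode#"${mode%?}"}           # last octal digit holds the "other" bits
    if [ $((other & 6)) -eq 6 ]; then
      echo "ok         $mode $owner $node"
    else
      echo "restricted $mode $owner $node"
    fi
  done
}

# Device nodes named in the mknod lines of the log above.
check_nodes /dev/nvidiactl /dev/nvidia[0-9]* /dev/nvidia-uvm \
            /dev/nvidia-modeset /dev/nvidia-caps/*
```

A "restricted" line only means the node is not world read/write; whether that actually explains the NVML error depends on the setup, so treat it as a hint for where to look next.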
#1547 is the same issue as yours.
Solved it; see #1547.