for-linux

Unable to see GPU process PIDs inside the container

Open • aditya-sanas opened this issue 10 months ago • 1 comment

When I run any GPU process inside my Docker container, the GPU is clearly being utilised, but the process PIDs do not appear in the output of nvidia-smi.

Steps to reproduce the issue

  1. docker run -it --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu22.04 bash
  2. Run any process utilising CUDA
  3. watch nvidia-smi
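
For a check that is easier to script than watching the table in step 3, something like the sketch below may help. It asks nvidia-smi for the compute process list as CSV and counts the rows. The helper name count_compute_apps is made up for illustration, and the sketch assumes the installed nvidia-smi supports --query-compute-apps with the pid, process_name and used_memory fields. With the behaviour described in this issue, the count inside the container would presumably be 0 even while memory is in use.

```shell
#!/bin/sh
# Hypothetical helper: counts non-empty lines on stdin.
# Each non-empty CSV line from the query below is one compute process.
count_compute_apps() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the helper from reporting a failure.
    grep -c . || true
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Assumption: this query option and these field names are available
    # in the installed driver's nvidia-smi.
    n=$(nvidia-smi --query-compute-apps=pid,process_name,used_memory \
            --format=csv,noheader | count_compute_apps)
    echo "compute processes visible to nvidia-smi: $n"
else
    echo "nvidia-smi not found; run this inside the container from step 1"
fi
```

Running the same script on the host and inside the container while the CUDA process is active gives two numbers to compare directly.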

Describe the results you received

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:41:00.0 Off |                  Off |
| 32%   41C    P2              63W / 450W |   7325MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off | 00000000:82:00.0 Off |                  Off |
| 32%   41C    P2              67W / 450W |   7555MiB / 24564MiB |    100%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Describe the results you expected

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:41:00.0 Off |                  Off |
| 32%   41C    P2              63W / 450W |   7325MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off | 00000000:82:00.0 Off |                  Off |
| 32%   41C    P2              67W / 450W |   7555MiB / 24564MiB |    100%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    149767      C   python                                     7318MiB |
|    1   N/A  N/A    150809      C   ...ubuntu/translators/.venv/bin/python     7548MiB |
+---------------------------------------------------------------------------------------+

Environment:

  • OS: Ubuntu 22.04.3 LTS
  • NVIDIA Container Toolkit: NVIDIA Container Runtime Hook version 1.17.5
  • Host can detect and use the GPU correctly

aditya-sanas • Apr 02 '25 03:04