Running in Docker container

Open v-snap opened this issue 1 year ago • 6 comments

import os
os.environ['PYOPENGL_PLATFORM'] = 'glx'
import genesis as gs
gs.init(backend=gs.cuda)

scene = gs.Scene(show_viewer=True)
plane = scene.add_entity(gs.morphs.Plane())
franka = scene.add_entity(
    gs.morphs.MJCF(file='xml/franka_emika_panda/panda.xml'),
)

scene.build()

for i in range(1000):
    scene.step()

Running the above inside a CUDA-enabled container fails to load gs.cuda as the backend, with the following error.

ERROR

[Genesis] [22:10:25] [INFO] ╭─────────────────────────────────────────────────────────────────────────────────────╮
[Genesis] [22:10:25] [INFO] │┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉ Genesis ┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉│
[Genesis] [22:10:25] [INFO] ╰─────────────────────────────────────────────────────────────────────────────────────╯
[Genesis] [22:10:27] [INFO] Running on [NVIDIA RTX 2000 Ada Generation Laptop GPU] with backend gs.cuda. Device memory: 7.75 GB.
[W 12/20/24 22:10:27.527 958] [cuda_driver.cpp:CUDADriver@135] The Taichi CUDA backend requires at least CUDA 10.0, got v0.0.
[W 12/20/24 22:10:27.528 958] [misc.py:adaptive_arch_select@758] Arch=[<Arch.cuda: 3>] is not supported, falling back to CPU
[Genesis] [22:10:27] [INFO] 🚀 Genesis initialized. 🔖 version: 0.2.0, 🌱 seed: None, 📏 precision: '32', 🐛 debug: False, 🎨 theme: 'dark'.
[Genesis] [22:10:28] [INFO] Scene <9fd6fd6> created.
[Genesis] [22:10:28] [INFO] Adding <gs.RigidEntity>. idx: 0, uid: <20b0275>, morph: <gs.morphs.Plane>, material: <gs.materials.Rigid>.
[Genesis] [22:10:28] [INFO] Adding <gs.RigidEntity>. idx: 1, uid: <a554545>, morph: <gs.morphs.MJCF(file='/root/.pyenv/versions/paz/lib/python3.10/site-packages/genesis/assets/xml/franka_emika_panda/panda.xml')>, material: <gs.materials.Rigid>.

I guess Taichi is not able to detect CUDA, and so it falls back to the CPU:

[W 12/20/24 22:10:27.527 958] [cuda_driver.cpp:CUDADriver@135] The Taichi CUDA backend requires at least CUDA 10.0, got v0.0.
[W 12/20/24 22:10:27.528 958] [misc.py:adaptive_arch_select@758] Arch=[<Arch.cuda: 3>] is not supported, falling back to CPU
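One way to narrow down a "got v0.0" warning like the one above is to ask the CUDA driver library for its version directly, without going through Taichi or PyTorch. The sketch below is only a diagnostic aid under stated assumptions: the helper name cuda_driver_version and the library names libcuda.so.1 and libcuda.so are illustrative choices for a Linux container, and this is not necessarily how Taichi itself locates the driver. It uses the CUDA driver API function cuDriverGetVersion, which reports the version encoded as 1000 * major + 10 * minor.

```python
import ctypes


def cuda_driver_version(lib_names=("libcuda.so.1", "libcuda.so")):
    """Return the CUDA driver version as (major, minor), or None if unavailable.

    The helper name and the default library names are assumptions made for
    illustration; adjust them to match your container image.
    """
    for name in lib_names:
        try:
            lib = ctypes.CDLL(name)
            query = lib.cuDriverGetVersion
        except (OSError, AttributeError):
            # Library not visible under this name, or symbol missing; try the next one.
            continue
        version = ctypes.c_int(0)
        # cuDriverGetVersion returns 0 (CUDA_SUCCESS) on success.
        if query(ctypes.byref(version)) == 0:
            return version.value // 1000, (version.value % 1000) // 10
    return None


if __name__ == "__main__":
    print("CUDA driver version:", cuda_driver_version())
```

If this prints None inside the container while nvidia-smi works, the driver library is probably not visible to the dynamic loader in that environment, which would be consistent with the warning; if it prints a sensible version, the problem more likely lies elsewhere.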

Here is the output of nvidia-smi inside the container:

Fri Dec 20 22:11:42 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX 2000 Ada Gene...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   47C    P3            588W /   35W |       9MiB /   8188MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

v-snap avatar Dec 20 '24 16:12 v-snap

What if you try this:

import torch
print(torch.cuda.is_available())

If it's not available, maybe try to set your cuda path manually?

export CUDA_HOME=/usr/local/cuda-11.7 # change it to your cuda path
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

wangyian-me avatar Dec 20 '24 16:12 wangyian-me

I have checked everything: CUDA exists and the paths are correct, but it still throws the error.

print(torch.cuda.is_available())

>>> import torch
>>> print(torch.cuda.is_available())
True
>>> 
export CUDA_HOME=/usr/local/cuda-11.7 # change it to your cuda path
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

v-snap avatar Dec 20 '24 16:12 v-snap

My guess is that it could be because the taichi package installed via pip is built with TI_WITH_CUDA_TOOLKIT OFF by default.

taichi installation instruction

TI_WITH_CUDA	Build with the CUDA backend	ON
TI_WITH_CUDA_TOOLKIT	Build with the CUDA toolkit	OFF

This might be why it works in the host environment but not inside the Docker container.

v-snap avatar Dec 20 '24 17:12 v-snap

https://github.com/ETSTribology/Genesis

My Dockerfile is not ready yet.

antoinebou12 avatar Dec 21 '24 03:12 antoinebou12

@antoinebou12 let us know if you get it working inside a container once your Dockerfile is complete.

If possible, could you run ti diagnose (or python -m taichi diagnose) inside the container and confirm whether the output shows cuda: True or cuda: False?

Here is my ti diagnose output, even though I am running inside a CUDA-compatible container:

cpu: True
metal: False
opengl: True
[W 12/21/24 16:05:49.213 15121] [cuda_driver.cpp:CUDADriver@135] The Taichi CUDA backend requires at least CUDA 10.0, got v0.0.
cuda: False
vulkan: True
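For anyone comparing results across containers, the backend lines above can be checked automatically. This is a minimal sketch that assumes the relevant part of the diagnose output consists of lines of the form "name: True" or "name: False", as in the excerpt above; parse_backends and SAMPLE are hypothetical names introduced here and are not part of Taichi.

```python
import re


def parse_backends(diagnose_text):
    """Map backend name -> bool from lines such as 'cuda: False'.

    Lines that do not match the 'name: True/False' shape (for example
    warnings) are ignored.
    """
    backends = {}
    for line in diagnose_text.splitlines():
        match = re.fullmatch(r"\s*(\w+)\s*:\s*(True|False)\s*", line)
        if match:
            backends[match.group(1)] = match.group(2) == "True"
    return backends


# Excerpt copied from the output posted above.
SAMPLE = """\
cpu: True
metal: False
opengl: True
[W 12/21/24 16:05:49.213 15121] [cuda_driver.cpp:CUDADriver@135] The Taichi CUDA backend requires at least CUDA 10.0, got v0.0.
cuda: False
vulkan: True
"""

if __name__ == "__main__":
    backends = parse_backends(SAMPLE)
    print("CUDA available according to diagnose:", backends.get("cuda"))
```

In practice the text could come from capturing the output of python -m taichi diagnose with subprocess.run, which makes it easy to fail a container build early when cuda is reported as False.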

v-snap avatar Dec 21 '24 10:12 v-snap

Saw something on twitter, maybe it will help? https://github.com/MizuhoAOKI/genesis_docker

wangyian-me avatar Dec 22 '24 05:12 wangyian-me