
100% CPU load when using show_viewer=True on Docker

Open Nulliik opened this issue 1 year ago • 8 comments

Environment:

  • OS: Windows, using Docker
  • GPU RTX 4090

I am trying to set up Genesis under Docker and render the visualization to a Windows X server. Everything currently works, but the viewer renders the scene using only the CPU; I believe this is tied to pyrender and PyOpenGL. My CPU usage is 100%; when I turn shadows off, it drops and stays at ~70%.

Setting os.environ["SDL_VIDEO_X11_FORCE_EGL"] = "1" and os.environ['PYOPENGL_PLATFORM'] = 'egl' results in

Exception in thread Thread-2 (_init_and_start_app):
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/viewer.py", line 1142, in _init_and_start_app
    pyglet.clock.tick()
  File "/opt/conda/lib/python3.11/site-packages/pyglet/clock.py", line 528, in tick
    return _default.tick(poll)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/pyglet/clock.py", line 270, in tick
    self.call_scheduled_functions(delta_t)
  File "/opt/conda/lib/python3.11/site-packages/pyglet/clock.py", line 217, in call_scheduled_functions
    item.func(now - item.last_ts, *item.args, **item.kwargs)
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/viewer.py", line 936, in _time_event
    self.on_draw()
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/viewer.py", line 635, in on_draw
    self._render()
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/viewer.py", line 1079, in _render
    retval = renderer.render(self.scene, flags, seg_node_map=seg_node_map)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/renderer.py", line 141, in render
    self._update_context(scene, flags)
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/renderer.py", line 899, in _update_context
    p._add_to_context()
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/primitive.py", line 359, in _add_to_context
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, FLOAT_SZ * 3, ctypes.c_void_p(0))
  File "/opt/conda/lib/python3.11/site-packages/OpenGL/latebind.py", line 63, in __call__
    return self.wrapperFunction( self.baseFunction, *args, **named )
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/OpenGL/GL/VERSION/GL_2_0.py", line 469, in glVertexAttribPointer
    contextdata.setValue( key, array )
  File "/opt/conda/lib/python3.11/site-packages/OpenGL/contextdata.py", line 58, in setValue
    context = getContext( context )
              ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/OpenGL/contextdata.py", line 40, in getContext
    raise error.Error(
OpenGL.error.Error: Attempt to retrieve context when no valid context
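
For reference, PyOpenGL picks its platform from PYOPENGL_PLATFORM when the OpenGL package is first imported, so the variable only takes effect if it is set before anything (Genesis and pyrender included) imports OpenGL. Below is a minimal sketch of that ordering. `force_egl` is a hypothetical helper name, and the SDL variable is simply the one quoted in this report; whether anything in this stack reads it is not established here. Setting these does not guarantee that EGL finds a usable GPU device inside the container.

```python
import os
import sys


def force_egl(env=None):
    """Request the EGL platform for PyOpenGL.

    Returns False when OpenGL was already imported, in which case the
    setting comes too late to change the platform for this process.
    """
    env = os.environ if env is None else env
    early_enough = "OpenGL" not in sys.modules
    env["PYOPENGL_PLATFORM"] = "egl"
    # Copied from the report above; assumed harmless if nothing reads it.
    env["SDL_VIDEO_X11_FORCE_EGL"] = "1"
    return early_enough


if not force_egl():
    print("warning: OpenGL was imported before PYOPENGL_PLATFORM was set")

# Import Genesis only after this point, e.g.:
# import genesis as gs
```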

Nulliik avatar Dec 26 '24 22:12 Nulliik

@Kashu7100 this error happens with the Dockerfile you created, so I am asking for your help.

Nulliik avatar Dec 26 '24 22:12 Nulliik

Also, running the rendering demo results in a crash:

/opt/conda/bin/python /workspace/examples/rendering/demo.py
[Genesis] [22:20:16] [INFO] ╭─────────────────────────────────────────────────────────────────────────────────────╮
[Genesis] [22:20:16] [INFO] │┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉ Genesis ┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉│
[Genesis] [22:20:16] [INFO] ╰─────────────────────────────────────────────────────────────────────────────────────╯
[Genesis] [22:20:16] [INFO] Running on [NVIDIA GeForce RTX 4090] with backend gs.cuda. Device memory: 23.99 GB.
[Genesis] [22:20:16] [DEBUG] [Taichi] version 1.7.2, llvm 15.0.4, commit 0131dce9, linux, python 3.11.10
[Genesis] [22:20:16] [DEBUG] [Taichi] Starting on arch=cuda
[Genesis] [22:20:16] [INFO] 🚀 Genesis initialized. 🔖 version: 0.2.0, 🌱 seed: 0, 📏 precision: '32', 🐛 debug: False, 🎨 theme: 'dark'.
[Genesis] [22:20:20] [INFO] Scene <48b5b1f> created.
[Genesis] [22:20:20] [INFO] Adding <gs.RigidEntity>. idx: 0, uid: <0d6b158>, morph: <gs.morphs.Plane>, material: <gs.materials.Rigid>.
[Genesis] [22:20:20] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 0.
[Genesis] [22:20:20] [INFO] Adding <gs.RigidEntity>. idx: 1, uid: <5194d74>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:22] [INFO] Preprocessing geom idx 1.
[Genesis] [22:20:28] [INFO] Adding <gs.RigidEntity>. idx: 2, uid: <b05341d>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:28] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 2.
[Genesis] [22:20:28] [INFO] Adding <gs.RigidEntity>. idx: 3, uid: <58b3211>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:28] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 3.
[Genesis] [22:20:28] [INFO] Adding <gs.RigidEntity>. idx: 4, uid: <df5a9c1>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:28] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 4.
[Genesis] [22:20:28] [INFO] Adding <gs.RigidEntity>. idx: 5, uid: <291a8d3>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:28] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 5.
[Genesis] [22:20:28] [INFO] Adding <gs.RigidEntity>. idx: 6, uid: <598c158>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:28] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 6.
[Genesis] [22:20:28] [INFO] Adding <gs.RigidEntity>. idx: 7, uid: <8f64b87>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:28] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 7.
[Genesis] [22:20:28] [INFO] Adding <gs.RigidEntity>. idx: 8, uid: <1723faa>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/wooden_sphere_OBJ/wooden_sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:30] [INFO] Preprocessing geom idx 8.
[Genesis] [22:20:36] [INFO] Adding <gs.RigidEntity>. idx: 9, uid: <6a43423>, morph: <gs.morphs.Mesh(file='/opt/conda/lib/python3.11/site-packages/genesis/assets/meshes/wooden_sphere_OBJ/wooden_sphere.obj')>, material: <gs.materials.Rigid>.
[Genesis] [22:20:36] [DEBUG] Preprocessed `.gsd` file found in cache for geom idx 9.
[Genesis] [22:20:36] [INFO] Building scene <48b5b1f>...
[Genesis] [22:20:52] [INFO] Compiling simulation kernels...
[Genesis] [22:20:59] [INFO] Building visualizer...
[Genesis] [22:20:59] [INFO] Viewer created. Resolution: 1920×1080, max_FPS: 60.
[Genesis] [22:21:11] [INFO] Resetting Scene <48b5b1f>.
[Genesis] [22:21:11] [INFO] Running at 402.59 FPS.
[2024-12-26 22:21:12.597] [console] [error] Assertion 'handle != nullptr' failed: OptiX library could not be loaded. [/workspace/Genesis/genesis/ext/LuisaRender/src/compute/src/backends/cuda/optix_api.cpp:162]
     0 [0x7f5f5ffe67fe]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-backend-cuda.so :: luisa::compute::optix::load_optix() + 622
     1 [0x7f5f5ffe6f53]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-backend-cuda.so :: luisa::compute::optix::api() + 115
     2 [0x7f5f5ff1c59b]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-backend-cuda.so :: luisa::compute::cuda::CUDADevice::Handle::optix_context() const + 235
     3 [0x7f5f5ff8d615]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-backend-cuda.so :: luisa::compute::cuda::CUDAPrimitive::_build(luisa::compute::cuda::CUDACommandEncoder&) + 165
     4 [0x7f5f5ff9ced6]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-backend-cuda.so :: luisa::compute::cuda::CUDAMesh::build(luisa::compute::cuda::CUDACommandEncoder&, luisa::compute::MeshBuildCommand*) + 374
     5 [0x7f5f5ff0c7a8]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-backend-cuda.so :: luisa::compute::cuda::CUDAStream::dispatch(luisa::compute::CommandList&&) + 200
     6 [0x7f5f5ff1db4e]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-backend-cuda.so :: luisa::compute::cuda::CUDADevice::dispatch(unsigned long, luisa::compute::CommandList&&) + 94
     7 [0x7f5f98179ec4]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/liblc-runtime.so :: luisa::compute::Stream::operator<<(luisa::compute::CommandList::Commit&&) + 52
     8 [0x7f5f9a7a5e08]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/libluisa-render-base.so :: luisa::render::Geometry::_process_shape(luisa::render::CommandBuffer&, float, luisa::render::Shape const*, luisa::render::Surface const*, luisa::render::Light const*, luisa::render::Medium const*, luisa::render::Subsurface const*, bool, unsigned long) + 1800
     9 [0x7f5f9a7a7fa2]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/libluisa-render-base.so :: luisa::render::Geometry::update(luisa::render::CommandBuffer&, ankerl::unordered_dense::v2_0_2::detail::table<luisa::render::Shape*, void, luisa::hash<luisa::render::Shape*>, std::equal_to<void>, luisa::allocator<luisa::render::Shape*>, ankerl::unordered_dense::v2_0_2::bucket_type::standard, eastl::vector<luisa::render::Shape*, eastl::allocator> > const&, float) + 354
    10 [0x7f5f9a76fcc8]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/libluisa-render-base.so :: luisa::render::Pipeline::update(luisa::compute::Stream&) + 792
    11 [0x7f5f9bcbef76]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/LuisaRenderPy.cpython-311-x86_64-linux-gnu.so :: unknown + 319350
    12 [0x7f5f9bc97b78]: /opt/conda/lib/python3.11/site-packages/genesis/ext/LuisaRender/build/bin/LuisaRenderPy.cpython-311-x86_64-linux-gnu.so :: unknown + 158584
    13 [0x55dbf9161f36]: /opt/conda/bin/python :: unknown + 2068278
    14 [0x55dbf913f97b]: /opt/conda/bin/python :: _PyObject_MakeTpCall + 667
    15 [0x55dbf914cf94]: /opt/conda/bin/python :: _PyEval_EvalFrameDefault + 1684
    16 [0x55dbf91717df]: /opt/conda/bin/python :: _PyFunction_Vectorcall + 383
    17 [0x55dbf9150ec7]: /opt/conda/bin/python :: _PyEval_EvalFrameDefault + 17863
    18 [0x55dbf9203c5d]: /opt/conda/bin/python :: unknown + 2731101
    19 [0x55dbf920339f]: /opt/conda/bin/python :: PyEval_EvalCode + 159
    20 [0x55dbf922131a]: /opt/conda/bin/python :: unknown + 2851610
    21 [0x55dbf921cf93]: /opt/conda/bin/python :: unknown + 2834323
    22 [0x55dbf9232540]: /opt/conda/bin/python :: unknown + 2921792
    23 [0x55dbf9231ecc]: /opt/conda/bin/python :: _PyRun_SimpleFileObject + 444
    24 [0x55dbf9231c64]: /opt/conda/bin/python :: _PyRun_AnyFileObject + 68
    25 [0x55dbf922c233]: /opt/conda/bin/python :: Py_RunMain + 899
    26 [0x55dbf91f3617]: /opt/conda/bin/python :: Py_BytesMain + 55
    27 [0x7f6111d18d90]: /lib/x86_64-linux-gnu/libc.so.6 :: unknown + 171408
    28 [0x7f6111d18e40]: /lib/x86_64-linux-gnu/libc.so.6 :: __libc_start_main + 128
    29 [0x55dbf91f34ca]: /opt/conda/bin/python :: unknown + 2663626
Aborted (core dumped)

Nulliik avatar Dec 26 '24 22:12 Nulliik

OptiX library could not be loaded.

Can you check if you have OptiX on your machine (like find /usr -name "libnvoptix.so*")?
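
The same check can be run from Python inside the container, which is handy when `find` is not available in a slim image. This is only a sketch that mirrors the `find` command above; `find_optix_libs` is a hypothetical helper name, and `/usr` is just the default search root taken from that command.

```python
import fnmatch
import os


def find_optix_libs(roots=("/usr",)):
    """Return sorted paths of files named libnvoptix.so* under the given roots."""
    matches = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in fnmatch.filter(filenames, "libnvoptix.so*"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling `find_optix_libs()` with no arguments searches `/usr`; an empty list corresponds to the `find` command printing nothing.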

Kashu7100 avatar Dec 27 '24 02:12 Kashu7100

Can you check if you have OptiX on your machine (like find /usr -name "libnvoptix.so*")?

It returns nothing. I am not sure where exactly I am supposed to have it installed (in my Docker container or on my Windows system), but the current Docker image doesn't include it.

Nulliik avatar Dec 27 '24 05:12 Nulliik

I didn't include the OptiX library in the Docker image (because of its license). I think you need to set it up on your host machine.

Kashu7100 avatar Dec 27 '24 09:12 Kashu7100

@Kashu7100 I don't think optix is relevant here? He is just using pyrender

zhouxian avatar Dec 27 '24 17:12 zhouxian

I encounter the same error on Linux without OptiX, so I suspect the missing OptiX is what causes this error. Luisa requires OptiX.

Kashu7100 avatar Dec 28 '24 02:12 Kashu7100

@Nulliik Could you check if other examples (like rigid, etc.) work on your side?

Kashu7100 avatar Dec 28 '24 08:12 Kashu7100

@Kashu7100 sorry for taking so long. Other examples do work, and I can even see the window rendered, but it still uses the CPU no matter what. My X server is running under Windows.

Nulliik avatar Jan 09 '25 19:01 Nulliik

Following other issues such as https://github.com/Genesis-Embodied-AI/Genesis/issues/43, I see that my OpenGL renderer is stuck on the CPU. After following the steps there, I end up with:

glx: failed to create drisw screen
display: 192.168.1.77:0  screen: 0
direct rendering: No (If you want to find out why, try setting LIBGL_DEBUG=verbose)
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: NVIDIA GeForce RTX 4090/PCIe/SSE2
OpenGL version string: 1.4 (4.6.0 NVIDIA 566.36)

X Error of failed request:  GLXBadCurrentWindow
  Major opcode of failed request:  146 (GLX)
  Minor opcode of failed request:  5 (X_GLXMakeCurrent)
  Serial number of failed request:  53
  Current serial number in output stream:  53
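
A small parser makes output like the above easier to compare across setups. This is a sketch only: `summarize_glxinfo` is a hypothetical helper, and the software-renderer names it looks for (llvmpipe, softpipe, swrast) are Mesa's software rasterizers. Note that the output above names an NVIDIA renderer while also reporting `direct rendering: No`, so the renderer string alone is not proof that the GPU is used the way a local application would use it.

```python
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast")


def summarize_glxinfo(text):
    """Extract the direct-rendering flag and renderer name from glxinfo-style output."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values may contain further colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    renderer = fields.get("opengl renderer string", "")
    return {
        "direct": fields.get("direct rendering", "").lower().startswith("yes"),
        "renderer": renderer,
        "software": any(name in renderer.lower() for name in SOFTWARE_RENDERERS),
    }
```

Feeding it the text captured from `glxinfo` in the container gives a dictionary whose `direct` and `software` entries can be checked at a glance.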

Nulliik avatar Jan 09 '25 20:01 Nulliik

@Nulliik Can you check whether this issue has been resolved on the main branch?

duburcqa avatar Mar 03 '25 08:03 duburcqa

@duburcqa I can confirm that, after your updates, Genesis currently runs under native Windows without this problem. I will test Docker now.

Nulliik avatar Mar 03 '25 09:03 Nulliik

@duburcqa I tested the Docker setup (note: not under WSL), and with the X server it still falls back to the CPU. I think I either need to stick with the WSL setup or move to Windows, as it runs smoothly there now.

My current non-working setup: Container → (X11 over TCP) → Windows (MobaXterm X Server) → Display

Nulliik avatar Mar 03 '25 10:03 Nulliik

OK, thank you. This is unfortunate. Did you try forcing EGL as you did before? I think it should work this time (as long as EGL is properly configured inside your container). Are you facing a similar issue with other graphical apps running inside your container?

duburcqa avatar Mar 03 '25 10:03 duburcqa

@duburcqa I am not sure whether EGL is configured properly, but forcing EGL resulted in the following error:

Traceback (most recent call last):
  File "/workspace/examples/tutorials/visualization.py", line 45, in <module>
    scene.build()
  File "/opt/conda/lib/python3.11/site-packages/genesis/utils/misc.py", line 39, in wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genesis/engine/scene.py", line 607, in build
    self._visualizer.build()
  File "/opt/conda/lib/python3.11/site-packages/genesis/vis/visualizer.py", line 100, in build
    self._viewer.build(self._scene)
  File "/opt/conda/lib/python3.11/site-packages/genesis/vis/viewer.py", line 82, in build
    self._pyrender_viewer = pyrender.Viewer(
                            ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/viewer.py", line 400, in __init__
    raise OpenGL.error.Error("Invalid OpenGL context.")
OpenGL.error.Error: Invalid OpenGL context.

Nulliik avatar Mar 03 '25 11:03 Nulliik

From what I see, I suspect that GPU-accelerated onscreen rendering is broken on your setup. I think it would fall back to software rendering for any graphical application.

You could check that by running Firefox and following these instructions: "You can check hardware acceleration state at about:support page, look at Compositing row. If there's WebRender, you're running on hardware. If there's WebRender (software) you're on non-accelerated backend." (source: Fedora documentation).

duburcqa avatar Mar 03 '25 11:03 duburcqa

Can confirm it falls back. [Image]

Nulliik avatar Mar 03 '25 11:03 Nulliik

OK, so, unfortunately, nothing can be done on the Genesis side to help you at this point. I suggest moving to offscreen rendering, as it may still be GPU accelerated (if you are lucky), or transitioning to WSL2 / native Windows (which is fine for Genesis, but may be problematic for other dependencies, if any). Closing!
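
For readers trying the offscreen route, a minimal sketch follows. It assumes the camera API as shown in the Genesis visualization tutorial (`scene.add_camera(..., GUI=False)` and `cam.render()` returning a tuple whose first element is the RGB image); these signatures are not confirmed anywhere in this thread and may differ between versions. `render_offscreen_frame` is a hypothetical helper name, and the camera pose values are arbitrary.

```python
def render_offscreen_frame():
    """Build a tiny scene without the interactive viewer and render one frame."""
    # Imported lazily so this sketch can be loaded on machines without Genesis.
    import genesis as gs

    gs.init(backend=gs.gpu)
    scene = gs.Scene(show_viewer=False)  # no onscreen window is requested
    scene.add_entity(gs.morphs.Plane())
    cam = scene.add_camera(
        res=(640, 480),
        pos=(3.5, 0.0, 2.5),
        lookat=(0.0, 0.0, 0.5),
        fov=30,
        GUI=False,  # assumed flag: do not open a preview window for the camera
    )
    scene.build()
    # Assumed return shape: a tuple with the RGB array first.
    return cam.render()[0]
```

Whether this path ends up GPU accelerated still depends on the container's OpenGL/EGL setup, as noted above.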

duburcqa avatar Mar 03 '25 12:03 duburcqa

Thank you for your help with the investigation! I'll post here if I find a solution.

Nulliik avatar Mar 03 '25 12:03 Nulliik

@duburcqa I am not sure whether EGL is configured properly, but forcing EGL resulted in the following error:

Traceback (most recent call last):
  File "/workspace/examples/tutorials/visualization.py", line 45, in <module>
    scene.build()
  File "/opt/conda/lib/python3.11/site-packages/genesis/utils/misc.py", line 39, in wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genesis/engine/scene.py", line 607, in build
    self._visualizer.build()
  File "/opt/conda/lib/python3.11/site-packages/genesis/vis/visualizer.py", line 100, in build
    self._viewer.build(self._scene)
  File "/opt/conda/lib/python3.11/site-packages/genesis/vis/viewer.py", line 82, in build
    self._pyrender_viewer = pyrender.Viewer(
                            ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/genesis/ext/pyrender/viewer.py", line 400, in __init__
    raise OpenGL.error.Error("Invalid OpenGL context.")
OpenGL.error.Error: Invalid OpenGL context.

Have you solved this problem? I ran into the same one: both egl and osmesa lead to the same error, OpenGL.error.Error: Invalid OpenGL context.

Haroldlhl avatar Mar 31 '25 08:03 Haroldlhl

@Haroldlhl Did you install the Genesis main branch?

duburcqa avatar Mar 31 '25 08:03 duburcqa

@Haroldlhl Did you install the Genesis main branch?

I have already installed it. I traced the problem to the pyrender.Viewer class failing to initialize properly, in Genesis/genesis/vis/viewer.py line 73. But the initialization involves multiple threads, and I could not narrow it down any further.

[Image] I use Ubuntu 20.04, a 14th-gen Intel i7, and an RTX 4060.

Haroldlhl avatar Mar 31 '25 11:03 Haroldlhl

OK, I see. Is the Firefox check mentioned here successful on your machine?

duburcqa avatar Mar 31 '25 11:03 duburcqa

You could specify run_in_thread=False in ViewerOptions to help locate the issue.
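
A minimal sketch of that suggestion, assuming the usual tutorial pattern of passing `viewer_options` to `gs.Scene`; only the `run_in_thread=False` option itself comes from the comment above, and `build_scene_with_inline_viewer` is a hypothetical helper name. The idea is that with the viewer no longer started in a background thread, a failure during viewer setup should surface in the main thread's traceback, which is easier to read and to step through in a debugger.

```python
def build_scene_with_inline_viewer():
    """Build a minimal scene with the viewer running outside a background thread."""
    # Imported lazily so this sketch can be loaded on machines without Genesis.
    import genesis as gs

    gs.init(backend=gs.gpu)
    scene = gs.Scene(
        show_viewer=True,
        viewer_options=gs.options.ViewerOptions(run_in_thread=False),
    )
    scene.add_entity(gs.morphs.Plane())
    # The viewer is created during build, per the tracebacks in this thread.
    scene.build()
    return scene
```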

duburcqa avatar Mar 31 '25 11:03 duburcqa

OK, I see. Is the Firefox check mentioned here successful on your machine?

I blacklisted the Nouveau driver, enabled Performance Mode, and rebooted, but Firefox still shows "Compositing WebRender (Software)". There doesn't seem to be a single switch that keeps all applications on the system in hardware rendering mode. I also looked this up and found: "in Python's OpenGL drawing programs, usually no additional settings are required to enable hardware acceleration; as long as the system environment is configured correctly, OpenGL will automatically try to use the available GPU for rendering."

I traced the problem to Genesis/genesis/ext/pyrender/texture.py line 219, where a Python wrapper calls a C function. I don't understand how this works. Perhaps the path to the C function is not resolved correctly?

Haroldlhl avatar Mar 31 '25 14:03 Haroldlhl

Firefox still shows "Compositing WebRender (Software)"

If Firefox cannot run on the GPU, then I don't think there is anything wrong with Genesis. As you are already quoting: "as long as the system environment is configured correctly". The same applies to Firefox, so I think your environment is not configured correctly. Have you managed to get any program running on the GPU yet?

duburcqa avatar Mar 31 '25 14:03 duburcqa

Just in case, can you post the updated Python traceback of the error?

duburcqa avatar Mar 31 '25 14:03 duburcqa