SimplerEnv

Error: undefined symbol: ncclCommRegister

Open KkkyleZ opened this issue 1 year ago • 0 comments

2024-08-17 12:02:29.525707: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-17 12:02:29.525757: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-08-17 12:02:29.526717: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-08-17 12:02:29.531867: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-17 12:02:30.139786: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/home/Projects/SimplerEnv-OpenVLA/simpler_env/main_inference.py", line 10, in <module>
    from simpler_env.policies.openvla.openvla_model import OpenVLAInference
  File "/home/Projects/SimplerEnv-OpenVLA/simpler_env/policies/openvla/openvla_model.py", line 6, in <module>
    from transformers import AutoModelForVision2Seq, AutoProcessor
  File "/home/Projects/.venv/lib/python3.10/site-packages/transformers/__init__.py", line 26, in <module>
    from . import dependency_versions_check
  File "/home/Projects/.venv/lib/python3.10/site-packages/transformers/dependency_versions_check.py", line 16, in <module>
    from .utils.versions import require_version, require_version_core
  File "/home/Projects/.venv/lib/python3.10/site-packages/transformers/utils/__init__.py", line 34, in <module>
    from .generic import (
  File "/home/Projects/.venv/lib/python3.10/site-packages/transformers/utils/generic.py", line 462, in <module>
    import torch.utils._pytree as _torch_pytree
  File "/home/Projects/.venv/lib/python3.10/site-packages/torch/__init__.py", line 239, in <module>
    from torch._C import *  # noqa: F403
ImportError: /home/Projects/.venv/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so: undefined symbol: ncclCommRegister
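
The ImportError above says that `libtorch_cuda.so` expects a symbol named `ncclCommRegister` which the NCCL library picked up at load time does not provide. One plausible cause (an assumption, not confirmed in this report) is that an older NCCL build is found first, for example through `LD_LIBRARY_PATH` or a mismatched NCCL wheel. A minimal diagnostic sketch, assuming a Linux environment, that lists every `libnccl.so*` under the active environment's site-packages and reports whether it exports the symbol. The helper names `has_symbol` and `find_nccl_candidates` are hypothetical and do not come from torch or NCCL:

```python
import ctypes
import glob
import os
import sysconfig
from typing import List, Optional


def has_symbol(lib_path: Optional[str], symbol: str) -> bool:
    """Return True if the shared library at lib_path exports `symbol`.

    lib_path=None inspects the symbols already visible in the current process.
    A library that cannot be loaded at all (missing file, missing
    dependencies) is reported as False.
    """
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return False
    # ctypes resolves the symbol lazily on attribute access and raises
    # AttributeError when it is absent, so hasattr() is sufficient here.
    return hasattr(lib, symbol)


def find_nccl_candidates() -> List[str]:
    """List libnccl shared objects below the active site-packages directory."""
    site_packages = sysconfig.get_paths()["purelib"]
    pattern = os.path.join(site_packages, "**", "libnccl.so*")
    return sorted(glob.glob(pattern, recursive=True))


if __name__ == "__main__":
    for path in find_nccl_candidates():
        print(path, has_symbol(path, "ncclCommRegister"))
```

If every listed library reports `True`, the copy that fails is probably being loaded from outside the virtual environment, so `LD_LIBRARY_PATH` and system-wide NCCL installs would be the next things to check.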

KkkyleZ, Aug 17 '24 16:08