Running on a Jetson Orin NX
Good morning,
I have been trying to make the exo project work on my Orin NX without success. Here is the error I get when running exo:
(exo) sgoudelis@jetson:~/projects/exo$ exo
Selected inference engine: None
  _____  _____
 / _ \ \/ / _ \
|  __/>  < (_) |
 \___/_/\_\___/
Detected system: Linux
Inference engine name after selection: tinygrad
Traceback (most recent call last):
File "/home/sgoudelis/miniconda3/envs/exo/bin/exo", line 33, in <module>
sys.exit(load_entry_point('exo', 'console_scripts', 'exo')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/miniconda3/envs/exo/bin/exo", line 25, in importlib_load_entry_point
return next(matches).load()
^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/importlib/metadata/__init__.py", line 205, in load
module = import_module(match.group('module'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/importlib/__init__.py", line 90, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 999, in exec_module
File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
File "/home/sgoudelis/projects/exo/exo/main.py", line 106, in <module>
inference_engine = get_inference_engine(inference_engine_name, shard_downloader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/projects/exo/exo/inference/inference_engine.py", line 69, in get_inference_engine
from exo.inference.tinygrad.inference import TinygradDynamicShardInferenceEngine
File "/home/sgoudelis/projects/exo/exo/inference/tinygrad/inference.py", line 4, in <module>
from exo.inference.tinygrad.models.llama import Transformer, TransformerShard, convert_from_huggingface, fix_bf16, sample_logits
File "/home/sgoudelis/projects/exo/exo/inference/tinygrad/models/llama.py", line 2, in <module>
from tinygrad import Tensor, Variable, TinyJit, dtypes, nn, Device
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/__init__.py", line 5, in <module>
from tinygrad.tensor import Tensor # noqa: F401
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/tensor.py", line 12, in <module>
from tinygrad.device import Device, BufferSpec
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/device.py", line 226, in <module>
class CPUProgram:
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/tinygrad/device.py", line 227, in CPUProgram
helper_handle = ctypes.CDLL(ctypes.util.find_library('System' if OSX else 'kernel32' if sys.platform == "win32" else 'gcc_s'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/ctypes/__init__.py", line 379, in __init__
self._handle = _dlopen(self._name, mode)
^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: /home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so: invalid ELF header
Looking into the .so file, I get this:
(exo) sgoudelis@jetson:~/projects/exo$ file /home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so
/home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so: ASCII text
(exo) sgoudelis@jetson:~/projects/exo$ more /home/sgoudelis/miniconda3/envs/exo/lib/libgcc_s.so
/* GNU ld script
Use the shared library, but some functions are only in
the static library. */
GROUP ( libgcc_s.so.1 -lgcc )
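So the file is a GNU ld linker script, not an ELF object: tinygrad's ctypes.util.find_library('gcc_s') resolves to this conda-shipped script, and dlopen() cannot parse ASCII text, hence the "invalid ELF header". Presumably moving the script aside would let the loader fall back to the real libgcc_s.so.1, e.g. (path assumes my conda env named exo):

(exo) sgoudelis@jetson:~/projects/exo$ mv ~/miniconda3/envs/exo/lib/libgcc_s.so ~/miniconda3/envs/exo/lib/libgcc_s.so.bak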
Does anyone have an idea how to make exo work on the Jetson Orin?
UPDATE:
Moving the mentioned linker script out of the way actually gets exo further. It then fails in a different way:
Traceback (most recent call last):
File "/home/sgoudelis/miniconda3/envs/exo/bin/exo", line 33, in <module>
sys.exit(load_entry_point('exo', 'console_scripts', 'exo')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/projects/exo/exo/main.py", line 385, in run
loop.run_until_complete(main())
File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
File "/home/sgoudelis/projects/exo/exo/main.py", line 349, in main
await node.start(wait_for_peers=args.wait_for_peers)
File "/home/sgoudelis/projects/exo/exo/orchestration/node.py", line 59, in start
self.device_capabilities = await device_capabilities()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/projects/exo/exo/topology/device_capabilities.py", line 153, in device_capabilities
return await linux_device_capabilities()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/projects/exo/exo/topology/device_capabilities.py", line 188, in linux_device_capabilities
gpu_memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/pynvml.py", line 2934, in nvmlDeviceGetMemoryInfo
_nvmlCheckReturn(ret)
File "/home/sgoudelis/miniconda3/envs/exo/lib/python3.12/site-packages/pynvml.py", line 979, in _nvmlCheckReturn
raise NVMLError(ret)
pynvml.NVMLError_NotSupported: Not Supported
I am a complete noob when it comes to NVIDIA CUDA stuff btw. I am guessing this happens because the Orin has shared memory.
ANOTHER UPDATE:
Exo does work with the Orin NX 16GB: bypassing the part of the code that queries the VRAM amount and giving it a bogus number makes exo boot up just fine, with GPU-accelerated inference.
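For reference, a minimal sketch of that bypass, assuming the call sites shown in the traceback above (the helper name is hypothetical, not exo's actual code):

import pynvml

def jetson_safe_gpu_memory_total(handle) -> int:
    # Total GPU memory in bytes, with a Jetson fallback.
    try:
        return pynvml.nvmlDeviceGetMemoryInfo(handle).total
    except pynvml.NVMLError_NotSupported:
        # Jetson iGPUs share system RAM and do not expose NVML memory
        # queries, so report MemTotal instead ("bogus" but workable).
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    return int(line.split()[1]) * 1024  # kB -> bytes
        return 0

linux_device_capabilities() could then call this instead of calling pynvml.nvmlDeviceGetMemoryInfo(handle) directly.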
I would love some feedback from one of the developers of the Exo project about this. Please feel free to comment.
I tried on two Xavier AGX units; exo cannot work there, it needs libnvidia-ml.so.1:
Detected system: Linux
Inference engine name after selection: tinygrad
Using inference engine: TinygradDynamicShardInferenceEngine with shard downloader: SingletonShardDownloader
[]
Chat interface started:
 - http://192.168.1.15:52415
 - http://172.17.0.1:52415
 - http://127.0.0.1:52415
ChatGPT API endpoint served at:
 - http://192.168.1.15:52415/v1/chat/completions
 - http://172.17.0.1:52415/v1/chat/completions
 - http://127.0.0.1:52415/v1/chat/completions
has_read=True, has_write=True
Traceback (most recent call last):
  File "/home/-----/exo/.venv/lib/python3.12/site-packages/pynvml.py", line 2248, in _LoadNvmlLibrary
    nvmlLib = CDLL("libnvidia-ml.so.1")
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/-----/miniconda3/lib/python3.12/ctypes/__init__.py", line 379, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
Same error "invalid ELF header" on WSL2
I successfully ran exo on a Jetson Orin NX 16GB using "source install.sh"; doing "pip install -e ." in a conda Python 3.12 environment hits the same "anaconda3/envs/exo/lib/libgcc_s.so: invalid ELF header" error.
Furthermore, the other error, "pynvml.NVMLError_NotSupported: Not Supported", occurs because Jetson devices do not support pynvml. You have to change the functions that get the GPU and memory information on Jetson devices, which is not that difficult:
try:
    # Identify the Jetson SoC from the device tree.
    with open("/proc/device-tree/compatible") as f:
        compatible = f.read().lower()
    if "tegra194" in compatible:
        gpu_name = "XAVIER"
    elif "tegra210" in compatible:
        gpu_name = "TX1"
    elif "tegra186" in compatible:
        gpu_name = "TX2"
    elif "tegra234" in compatible:  # Orin family
        gpu_name = "Jetson_NX"
    else:
        gpu_name = "JETSON_GPU"
    # The iGPU shares system RAM, so report MemTotal as "VRAM".
    with open("/proc/meminfo") as f:
        for line in f:
            if "MemTotal" in line:
                total_mem = int(line.split()[1]) * 1024  # kB -> bytes
                break
        else:
            total_mem = 0
    # Mimic the object pynvml.nvmlDeviceGetMemoryInfo() would return.
    gpu_memory_info = type('', (object,), {"total": total_mem})()
except OSError:
    # Not a device-tree platform, or the files are missing.
    gpu_name = "JETSON_GPU"
    gpu_memory_info = type('', (object,), {"total": 0})()
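To wire those values back into exo, something like the following might work; the field names assume exo's DeviceCapabilities and DeviceFlops dataclasses from device_capabilities.py, and the memory unit (which appears to be megabytes there) should be verified before copying:

from exo.topology.device_capabilities import DeviceCapabilities, DeviceFlops

def jetson_device_capabilities(gpu_name: str, total_mem: int) -> DeviceCapabilities:
    return DeviceCapabilities(
        model=f"Linux Box ({gpu_name})",
        chip=gpu_name,
        memory=total_mem // 2**20,  # bytes -> MB
        flops=DeviceFlops(fp32=0, fp16=0, int8=0),  # unknown for Tegra; zeros as placeholder
    )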
Here comes a problem: after I downloaded llama3.2-8B, it could be loaded into memory, but then the process was killed.
I ran exo on two nodes successfully, but the model seems to be loaded into memory twice or more. The inference time is too long; llama3.2:1b runs at just 2 tokens/s.
@Mr-lwd Yes, it loaded at least twice; llama3.2-3b is very, very slow.