
pip install -e . does not work

Open wangkuiyi opened this issue 1 year ago • 8 comments

System Info

x86, H100, Ubuntu

Who can help?

No response

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

On a system with an H100 running CUDA 12.3, I installed the dependencies by running the scripts referred to by Dockerfile.multi (https://github.com/NVIDIA/TensorRT-LLM/blob/0ab9d17a59c284d2de36889832fe9fc7c8697604/docker/Dockerfile.multi#L8-L51), translating the ENV instructions into exports in ~/.bashrc.

This allowed me to run the following command to build TensorRT-LLM from source code:

pip install -e . --extra-index-url https://pypi.nvidia.com

The build process finishes very fast, which does not look right, because build_wheel.py usually takes about 40 minutes to build everything.

After the build, pip list shows that tensorrt-llm is installed.

$ pip list | grep tensorrt
tensorrt                 9.2.0.post12.dev5
tensorrt-llm             0.9.0.dev2024020600 /root/TensorRT-LLM

However, importing it fails:

$ python3 -c 'import tensorrt_llm'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/root/TensorRT-LLM/tensorrt_llm/__init__.py", line 44, in <module>
    from .hlapi.llm import LLM, ModelConfig
  File "/root/TensorRT-LLM/tensorrt_llm/hlapi/__init__.py", line 1, in <module>
    from .llm import LLM, ModelConfig
  File "/root/TensorRT-LLM/tensorrt_llm/hlapi/llm.py", line 17, in <module>
    from ..executor import (GenerationExecutor, GenerationResult,
  File "/root/TensorRT-LLM/tensorrt_llm/executor.py", line 11, in <module>
    import tensorrt_llm.bindings as tllm
ModuleNotFoundError: No module named 'tensorrt_llm.bindings'
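
As an editorial aside, a hedged diagnostic for this failure mode: one way to confirm the compiled extension was never built is to probe for the submodule with importlib before importing the whole package. The helper name `has_compiled_bindings` below is illustrative, not part of TensorRT-LLM.

```python
import importlib.util


def has_compiled_bindings(module_name: str) -> bool:
    """Return True if `module_name` can be located on sys.path.

    Handy for checking whether an editable install actually produced the
    compiled extension (e.g. 'tensorrt_llm.bindings') before importing
    the whole package.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # find_spec raises this when a parent package itself is missing.
        return False


print(has_compiled_bindings("math"))                  # True: stdlib module
print(has_compiled_bindings("no_such_pkg.bindings"))  # False: never built
```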

Expected behavior

My project requires me to build the main branch of TensorRT-LLM. It would be great if pip install could work, so I could declare TensorRT-LLM as a dependency in my project's pyproject.toml file.
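
For context, once `pip install` works, declaring the main branch as a dependency would look roughly like the PEP 508 direct reference below. This is a sketch based on this thread; the project name and metadata are placeholders, and the reference is only useful once pip can build TensorRT-LLM from source.

```toml
[project]
name = "my-project"          # hypothetical downstream project
version = "0.1.0"
dependencies = [
    # Direct reference to the main branch of the repository.
    "tensorrt-llm @ git+https://github.com/NVIDIA/TensorRT-LLM.git@main",
]
```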

actual behavior

I had to build TensorRT-LLM by invoking build_wheel.py as in https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/build_from_source.md#build-tensorrt-llm

additional notes

I was able to build vLLM with its CUDA kernels using pip install -e .. Not sure if we could take their build setup as a reference.
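
For reference, a hedged sketch of the setuptools pattern that lets `pip install -e .` build native code (the general approach projects like vLLM use): a custom `build_ext` command that shells out to CMake. All names here (`CMakeExtension`, `CMakeBuild`, `example_pkg`) are illustrative, not TensorRT-LLM's or vLLM's actual build code.

```python
import subprocess
from pathlib import Path

from setuptools import Extension
from setuptools.command.build_ext import build_ext


class CMakeExtension(Extension):
    """An Extension whose sources are compiled by CMake, not setuptools."""

    def __init__(self, name: str, sourcedir: str = "."):
        super().__init__(name, sources=[])  # CMake owns the source list
        self.sourcedir = str(Path(sourcedir).resolve())


class CMakeBuild(build_ext):
    def build_extension(self, ext: CMakeExtension) -> None:
        out_dir = Path(self.get_ext_fullpath(ext.name)).parent.resolve()
        build_dir = Path(self.build_temp)
        build_dir.mkdir(parents=True, exist_ok=True)
        # Configure, then build; pip runs this during `pip install -e .`.
        subprocess.run(
            ["cmake", ext.sourcedir,
             f"-DCMAKE_LIBRARY_OUTPUT_DIRECTORY={out_dir}"],
            cwd=build_dir, check=True)
        subprocess.run(["cmake", "--build", "."], cwd=build_dir, check=True)


# setup.py would then register the command so pip triggers the CMake build:
# setup(
#     name="example_pkg",
#     ext_modules=[CMakeExtension("example_pkg.bindings")],
#     cmdclass={"build_ext": CMakeBuild},
# )
```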

wangkuiyi avatar Feb 07 '24 21:02 wangkuiyi

To install the main branch, you can use the following command:

pip3 install tensorrt_llm -U --pre --extra-index-url https://pypi.nvidia.com

Also check the README for more detail.

TobyGE avatar Feb 08 '24 07:02 TobyGE

That doesn’t work. As described above, the project requires TensorRT-LLM built from the main branch.

wangkuiyi avatar Feb 08 '24 08:02 wangkuiyi

I used it and it worked; my current version is dev2024020600.

Your CUDA version is too high. Follow the instructions and use CUDA 12.1.

TobyGE avatar Feb 08 '24 09:02 TobyGE

It doesn't build the main branch. It installs the most recent pre-release version.

wangkuiyi avatar Feb 08 '24 21:02 wangkuiyi

It looks like pip install -e . does not automatically trigger the building of the Python bindings of the C++ runtime.

wangkuiyi avatar Feb 09 '24 20:02 wangkuiyi

@Shixiaowei02 , can you help with that issue, please?

jdemouth-nvidia avatar Feb 11 '24 12:02 jdemouth-nvidia

@wangkuiyi , for your information, @Shixiaowei02 is based in China. It means that he won't be able to work on this issue before the end of the break for the Chinese New Year.

jdemouth-nvidia avatar Feb 11 '24 12:02 jdemouth-nvidia

Thank you @jdemouth-nvidia and @Shixiaowei02 ! No rush please. It is totally fine after the lunar new year.

wangkuiyi avatar Feb 11 '24 15:02 wangkuiyi

I am working on fixing this issue now. Thanks for your support!

Shixiaowei02 avatar Feb 20 '24 07:02 Shixiaowei02

Thanks! I am also facing this issue.

ekagra-ranjan avatar Feb 20 '24 08:02 ekagra-ranjan

I am facing the same issue.

andyluo7 avatar Feb 26 '24 06:02 andyluo7

Can you use these two commands to temporarily bypass this issue? We will fix this issue in the near future and synchronize it to the main branch. Thank you! @wangkuiyi

python3 scripts/build_wheel.py --trt_root /usr/local/tensorrt
pip3 install -e .

Shixiaowei02 avatar Feb 29 '24 02:02 Shixiaowei02

After building the wheel and doing the editable install, I still get the same error:

    import tensorrt_llm.bindings as tllm
ModuleNotFoundError: No module named 'tensorrt_llm.bindings'

lifelongeeek avatar Mar 02 '24 06:03 lifelongeeek

Currently, the calling relationship between build_wheel.py and setup.py is inverted, resulting in an incomplete installation when users run pip install -e .. Meanwhile, setup.py has been deprecated, so we give a friendlier error as a stopgap for now. We will come back and refactor when we have the bandwidth. Thank you!
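
An editorial sketch of the "friendlier error" stopgap described here, written as generic Python; the function name and message wording are illustrative, not the actual patch:

```python
import importlib


def import_with_guidance(module_name: str):
    """Import a module, re-raising ImportError with an actionable hint.

    Mirrors the stopgap described above: when the compiled bindings are
    missing (e.g. after a bare `pip install -e .`), tell the user to run
    the build script first instead of failing with a raw import error.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"Import of {module_name!r} failed. If you are using an "
            "editable installation, run build_wheel.py first, then "
            "`pip install -e .`"
        ) from err


mod = import_with_guidance("math")  # succeeds for an importable module
print(mod.sqrt(4.0))  # → 2.0
```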

Shixiaowei02 avatar Apr 11 '24 08:04 Shixiaowei02

> a friendlier error

@Shixiaowei02 I got this error with the recently released tensorrt-llm v0.9.0. Please give me some advice on how to fix it, thanks!

$ python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
Traceback (most recent call last):
  File "/opt/workspace/TensorRT-LLM_v0.9.0/tensorrt_llm/__init__.py", line 39, in <module>
    import tensorrt_llm.bindings  # NOQA
ImportError: /opt/workspace/TensorRT-LLM_v0.9.0/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZNK12tensorrt_llm8executor25SpeculativeDecodingConfig22getAcceptanceThresholdEv

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/workspace/TensorRT-LLM_v0.9.0/tensorrt_llm/__init__.py", line 41, in <module>
    raise ImportError(
ImportError: Import of the bindings module failed. Please check the package integrity. If you are attempting to use the pip development mode (editable installation), please execute build_wheels.py first, and then run `pip install -e .`

felixslu avatar Apr 16 '24 03:04 felixslu