MPT not supported by current source code?
I'm having problems when using MPT. Setup: AWS g5.48xlarge, CUDA 12.1.0, Ubuntu 22.04, Python 3.10, PyTorch 2.1.2.
```
root@7f51eddb66f5:/TensorRT-LLM/examples/mpt# trtllm-build --checkpoint_dir=./ft_ckpts/mpt-7b/fp16 \
    --max_batch_size 32 \
    --max_input_len 1024 \
    --max_output_len 512 \
    --use_gpt_attention_plugin \
    --use_gemm_plugin
    --workers 1 \
    --output_dir ./trt_engines/mpt-7b/fp16
[TensorRT-LLM] TensorRT-LLM version: 0.8.0.dev2024012302
Traceback (most recent call last):
  File "/usr/local/bin/trtllm-build", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 369, in main
    parallel_build(source, build_config, args.output_dir, workers,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 293, in parallel_build
    build_and_save(rank, rank % workers, ckpt_dir, build_config,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 267, in build_and_save
    engine = build(build_config,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 244, in build
    raise RuntimeError(
RuntimeError: Unsupported model architecture: MPTForCausalLM
bash: --workers: command not found
```
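As an aside, the trailing `bash: --workers: command not found` is unrelated to the RuntimeError: the line continuation after `--use_gemm_plugin` is missing, so the shell ran `trtllm-build` without the last two options and then tried to execute `--workers 1 --output_dir ...` as a separate command. The same invocation with the backslash restored:

```bash
trtllm-build --checkpoint_dir=./ft_ckpts/mpt-7b/fp16 \
    --max_batch_size 32 \
    --max_input_len 1024 \
    --max_output_len 512 \
    --use_gpt_attention_plugin \
    --use_gemm_plugin \
    --workers 1 \
    --output_dir ./trt_engines/mpt-7b/fp16
```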
https://github.com/NVIDIA/TensorRT-LLM/blob/3d56a445e8ebf888e78be638faf6beec0a78f3c2/tensorrt_llm/commands/build.py#L232-L236
In the latest main branch, this RuntimeError is raised at line 234.
@TobyGE Could you please update to the main branch?
I used this command this morning.
```bash
pip3 install tensorrt_llm -U --pre --extra-index-url https://pypi.nvidia.com
```
My current version is `[TensorRT-LLM] TensorRT-LLM version: 0.8.0.dev2024012302`.
Do you mean it is still not the latest version? If so, what command should I use to install the latest branch?
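For reference, a quick way to confirm which build pip actually installed, assuming the package exposes the standard `__version__` attribute:

```bash
# Print the installed TensorRT-LLM package version
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
# Or query pip's metadata directly
pip3 show tensorrt_llm | grep -i '^version'
```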
@Shixiaowei02 It seems that the tensorrt_llm package on PyPI is not the latest one. Could you please help confirm?
We have fixed this issue, thanks!
Thanks, my current version is now `TensorRT-LLM version: 0.8.0.dev2024013000`.
I kept following the MPT instructions: tp=1 works well, but there are problems when tp=4. Can you re-open this issue, or should I open a new one?
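(For context, the tensor-parallel flow from the MPT example looks roughly like the sketch below. The flag names, paths, and the `--tp_size` option are assumptions based on the 0.8-era examples, not verbatim from the README, so double-check them against examples/mpt for your version.)

```bash
# Convert the HF checkpoint with 4-way tensor parallelism.
# NOTE: --tp_size and the paths are assumptions from the 0.8-era MPT
# example; verify against examples/mpt/README.md for your version.
python3 convert_checkpoint.py --model_dir mosaicml/mpt-7b \
    --output_dir ./ft_ckpts/mpt-7b/fp16_tp4 \
    --dtype float16 \
    --tp_size 4

# Build the engines (one per rank)
trtllm-build --checkpoint_dir ./ft_ckpts/mpt-7b/fp16_tp4 \
    --output_dir ./trt_engines/mpt-7b/fp16_tp4 \
    --workers 4

# Run with one MPI rank per GPU
mpirun -n 4 --allow-run-as-root \
    python3 ../run.py --engine_dir ./trt_engines/mpt-7b/fp16_tp4 \
        --tokenizer_dir mosaicml/mpt-7b \
        --max_output_len 64
```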
#1034
I opened a new issue to discuss this, and found someone else facing a similar problem. Please check it out.
convert_checkpoint.py for MPT is not synced with the latest llm_foundry MPT model.