MPT not supported by current source code?
I'm having problems when using MPT. Setup: AWS g5.48xlarge, CUDA 12.1.0, Ubuntu 22.04, Python 3.10, PyTorch 2.1.2.
```
root@7f51eddb66f5:/TensorRT-LLM/examples/mpt# trtllm-build --checkpoint_dir=./ft_ckpts/mpt-7b/fp16 \
    --max_batch_size 32 \
    --max_input_len 1024 \
    --max_output_len 512 \
    --use_gpt_attention_plugin \
    --use_gemm_plugin
    --workers 1 \
    --output_dir ./trt_engines/mpt-7b/fp16
[TensorRT-LLM] TensorRT-LLM version: 0.8.0.dev2024012302
Traceback (most recent call last):
  File "/usr/local/bin/trtllm-build", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 369, in main
    parallel_build(source, build_config, args.output_dir, workers,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 293, in parallel_build
    build_and_save(rank, rank % workers, ckpt_dir, build_config,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 267, in build_and_save
    engine = build(build_config,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 244, in build
    raise RuntimeError(
RuntimeError: Unsupported model architecture: MPTForCausalLM
bash: --workers: command not found
```
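As an aside, the trailing `bash: --workers: command not found` is unrelated to the RuntimeError: the line continuation after `--use_gemm_plugin` is missing, so the shell ran `trtllm-build` without the last two options and then tried to execute `--workers 1 --output_dir ...` as a separate command. The same invocation with the backslash restored:

```bash
trtllm-build --checkpoint_dir=./ft_ckpts/mpt-7b/fp16 \
    --max_batch_size 32 \
    --max_input_len 1024 \
    --max_output_len 512 \
    --use_gpt_attention_plugin \
    --use_gemm_plugin \
    --workers 1 \
    --output_dir ./trt_engines/mpt-7b/fp16
```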
https://github.com/NVIDIA/TensorRT-LLM/blob/3d56a445e8ebf888e78be638faf6beec0a78f3c2/tensorrt_llm/commands/build.py#L232-L236
In the latest main branch, this RuntimeError is raised at line 234.
@TobyGE Could you please update to the main branch?
I used this command this morning.
```bash
pip3 install tensorrt_llm -U --pre --extra-index-url https://pypi.nvidia.com
```
My current version is `[TensorRT-LLM] TensorRT-LLM version: 0.8.0.dev2024012302`.
Do you mean it is still not the latest version? If so, what command should I use to install the latest branch?
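For reference, a quick way to confirm which build pip actually installed, assuming the package exposes the standard `__version__` attribute:

```bash
# Print the installed TensorRT-LLM package version
python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
# Or query pip's metadata directly
pip3 show tensorrt_llm | grep -i '^version'
```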
@Shixiaowei02 It seems that the tensorrt_llm package on PyPI is not the latest one. Could you please help confirm?
We have fixed this issue, thanks!
Thanks, my current version is now `TensorRT-LLM version: 0.8.0.dev2024013000`.
I kept following the MPT instructions: tp=1 works well, but there are problems when tp=4. Can you re-open this issue, or should I open a new one?
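(For context, the tensor-parallel flow from the MPT example looks roughly like the sketch below. The flag names, paths, and the `--tp_size` option are assumptions based on the 0.8-era examples, not verbatim from the README, so double-check them against examples/mpt for your version.)

```bash
# Convert the HF checkpoint with 4-way tensor parallelism.
# NOTE: --tp_size and the paths are assumptions from the 0.8-era MPT
# example; verify against examples/mpt/README.md for your version.
python3 convert_checkpoint.py --model_dir mosaicml/mpt-7b \
    --output_dir ./ft_ckpts/mpt-7b/fp16_tp4 \
    --dtype float16 \
    --tp_size 4

# Build the engines (one per rank)
trtllm-build --checkpoint_dir ./ft_ckpts/mpt-7b/fp16_tp4 \
    --output_dir ./trt_engines/mpt-7b/fp16_tp4 \
    --workers 4

# Run with one MPI rank per GPU
mpirun -n 4 --allow-run-as-root \
    python3 ../run.py --engine_dir ./trt_engines/mpt-7b/fp16_tp4 \
        --tokenizer_dir mosaicml/mpt-7b \
        --max_output_len 64
```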
#1034
I opened a new issue to discuss this, and found someone else facing a similar problem. Please check it out.
convert_checkpoint.py for MPT is not synced with the latest llm_foundry MPT model.