Fanrong Li

59 comments by Fanrong Li

Please also enable the [hopper MTP tests](https://github.com/NVIDIA/TensorRT-LLM/blob/a33c595c88bc559a8eb02fa8c9d1e9f99fb1e89e/tests/unittest/_torch/multi_gpu_modeling/test_deepseek.py#L77), thanks~

We now support the Mamba2 model with the HF Mamba2 config format: https://huggingface.co/state-spaces/mamba2-2.7b/blob/main/config.json. For mamba-codestral-7B-v0.1, you can create a new config.json from the existing params.json and make it similar...
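For orientation, the linked HF Mamba2 config has roughly the shape below. This is a hedged sketch: the field names and values are recalled from the linked mamba2-2.7b config, not taken from codestral's params.json, so treat every number as a placeholder and verify against the linked file.

```shell
# Sketch only: write a config.json in the HF Mamba2 layout, then replace the
# values with the ones from your params.json. Values shown are recalled from
# state-spaces/mamba2-2.7b and may have drifted.
cat > config.json << 'EOF'
{
  "d_model": 2560,
  "d_intermediate": 0,
  "n_layer": 64,
  "vocab_size": 50277,
  "ssm_cfg": {
    "layer": "Mamba2"
  },
  "rms_norm": true,
  "residual_in_fp32": true,
  "fused_add_norm": true,
  "pad_vocab_size_multiple": 16,
  "tie_embeddings": true
}
EOF
```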

We added a mamba-codestral-7B-v0.1 example in today's update. Please refer to https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/mamba and give it a try.
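If you just want the shape of the workflow before opening the README, it typically looks like the sketch below. The model path, flags, and output directories are illustrative assumptions, not the exact commands from the example.

```shell
# Illustrative TensorRT-LLM example flow for Mamba models; see
# examples/mamba/README.md for the exact, supported commands.
git clone https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1 mamba_codestral_7b

# Convert the HF checkpoint into a TensorRT-LLM checkpoint.
python3 convert_checkpoint.py --model_dir mamba_codestral_7b \
    --output_dir trt_ckpt/mamba_codestral_7b/fp16/1-gpu \
    --dtype float16

# Build the TensorRT engine from the converted checkpoint.
trtllm-build --checkpoint_dir trt_ckpt/mamba_codestral_7b/fp16/1-gpu \
    --output_dir trt_engines/mamba_codestral_7b/fp16/1-gpu
```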

> cannot install tensorrt_llm==0.12.0.dev2024072301

You need to reinstall tensorrt_llm.
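For reference, reinstalling a pre-release wheel per the Linux install docs looks along these lines; the pinned version here is just the one from the quoted report.

```shell
# Reinstall the pre-release wheel from NVIDIA's PyPI index, as described in
# the TensorRT-LLM Linux installation docs.
pip3 install --upgrade --pre --extra-index-url https://pypi.nvidia.com \
    tensorrt_llm==0.12.0.dev2024072301
```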

I cannot reproduce this error. Can you share your command?

> Are there plans to support tp>1 @lfr-0531?

Coming soon.

We can support the glm4-9b model, but cannot support the LongWriter-glm4-9b model yet. For the glm4-9b model:

```shell
git clone https://huggingface.co/THUDM/glm-4-9b glm_4_9b
python3 convert_checkpoint.py --model_dir glm_4_9b --output_dir trt_ckpt/glm_4_9b/fp16/1-gpu
trtllm-build --checkpoint_dir trt_ckpt/glm_4_9b/fp16/1-gpu...
```
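After the build finishes, a quick smoke test with the shared example runner usually looks like this. The engine directory is an assumption (the trtllm-build command above is truncated), and the flags are the common ones from the TensorRT-LLM examples, not copied from this thread.

```shell
# Hypothetical smoke test with the examples' run.py; adjust the engine and
# tokenizer paths to match your build step above.
python3 ../run.py --engine_dir trt_engines/glm_4_9b/fp16/1-gpu \
    --tokenizer_dir glm_4_9b \
    --input_text "Hello" \
    --max_output_len 64
```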

I cannot reproduce this issue locally. Could you try the latest main branch, and follow the [install doc](https://nvidia.github.io/TensorRT-LLM/installation/linux.html#installing-on-linux) to make sure TensorRT-LLM is installed correctly?