sunnyqgg
Hi @bnuzhanyu, you got the "TensorrtLLM output" from the source code of "Tensorrt-llm Print input image and output embedding:", right? And how did you get the "Qwen-VL-Chat ModelScope" results? If...
Hi @calico-niko @bnuzhanyu, the ViT is offloaded to TRT, and its FP32 accuracy on TRT 9.3 is aligned with PyTorch. You can also change the version of TRT...
Hi @hezeli123, the diffs are smaller compared with TRT 9.x; do the current ViT diffs have a big impact on the final results? If so, you can try to...
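As a side note for anyone measuring such diffs: a simple way to check whether ViT embedding differences matter is to compare the TRT and PyTorch outputs directly with max absolute error and cosine similarity. The helper below is a hypothetical sketch (not part of TensorRT-LLM); it assumes you have already dumped both embeddings as flat lists of floats.

```python
import math

def embedding_diff_stats(a, b):
    """Compare two flat embedding vectors of equal length.

    Returns (max_abs_diff, cosine_similarity). A tiny max-abs diff
    and cosine close to 1.0 suggest the diffs are numerically benign.
    """
    assert len(a) == len(b), "embeddings must have the same length"
    max_abs = max(abs(x - y) for x, y in zip(a, b))
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = dot / (norm_a * norm_b)
    return max_abs, cosine

# Example: a small uniform perturbation barely moves the cosine similarity.
ref = [1.0] * 8
test = [x + 1e-3 for x in ref]
max_abs, cosine = embedding_diff_stats(ref, test)
print(f"max_abs_diff={max_abs:.4e} cosine={cosine:.6f}")
```

If the cosine similarity stays very close to 1.0 but the end-to-end answers still change, the sensitivity is likely downstream of the ViT rather than in the ViT itself.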
OK. If you have a strong need to use FP16, I'll continue to look at this issue; if not, it will have a lower priority.
Hi @jdmdj1999 @chiquitita-101, which TRT version are you using, and which quantization method are you using for Qwen?
Hi, I'll do it.
Hi, the work is in progress, I'll update it ASAP.
Hi, the code is under review and almost done, it'll be public soon.
It's supported; please see examples/multimodal for more info.
Hi @LugerW-A,

> For the Qwen2-VL 2B model, TRT_LLM is more than twice as slow as vllm.

I have noticed this issue and fixed it already; hope it'll be public...