xiangxinhello

27 comments by xiangxinhello

> OK, if you have a strong desire to use FP16, I'll continue to look at this issue; if not, it will have a lower priority.

Hi @sunnyqgg. The...

![sample-500](https://github.com/showlab/Tune-A-Video/assets/169245314/7b9ff89d-5081-48fd-9949-fcf385f27cb2) I think the reconstruction quality is poor; is there anything that can be fixed or improved? The main problem is that the reconstruction is slightly distorted. Looking forward...

![2024-05-22 10-32-43屏幕截图](https://github.com/Zj-BinXia/SSL/assets/169245314/6a95f20b-f5d6-40d6-816a-2983f5d2f559) ![2024-05-22 10-36-47屏幕截图](https://github.com/Zj-BinXia/SSL/assets/169245314/7294b4b6-c224-4d71-a33f-375844bc1e3d) I tested with your pretrained weights but did not see any visualization results.

> Hi, @xiangxinhello, could you help to provide your /tmp/Qwen/7B/config.json file?

Hi @Kefeng-Duan, here it is:

```json
{
  "architectures": ["Qwen2ForCausalLM"],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": ...
```

> @nv-guomingz for vis

Hi @nv-guomingz:

```json
{
  "architectures": ["Qwen2ForCausalLM"],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": ...
```

> I think the modification you applied to `def set_smooth_quant_plugins` doesn't make sense here, because you converted the model with the quantization mode set to weights-only (W8A16), and `set_smooth_quant_plugins` is a...

> It depends case by case. For int8 weight-only, the matrix multiplication using fp16 is the expected behaviour, right?

I want the matrix multiplication to use int8, but trt-build...

> Try the commands below with the latest trtllm again. My local testing passes.
>
> ```bash
> python convert_checkpoint.py \
>     --model_dir /workspace/mnt/storage/trt/Qwen1.5-7B-Chat/ \
>     --output_dir ./tllm_checkpoint_1gpu_sq_test \
>     --dtype float16 \
>     --smoothquant 0.5
> ```
>
> ...
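The truncated quote presumably continues with the engine-build step. As a minimal sketch of the typical follow-up (the engine output directory here is illustrative, not from the original comment):

```bash
# Build an engine from the SmoothQuant checkpoint produced above.
# Keeping --gemm_plugin enabled (not None) matters for the plugin
# selection discussed in the next comment.
trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_sq_test \
             --output_dir ./trt_engines/qwen1.5-7b-sq \
             --gemm_plugin float16
```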

> IIRC, TRT-LLM will call the SmoothQuant plugin when you didn't set --gemm_plugin to None and generated the checkpoint with the --smoothquant knob. Check [this](https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/quantization/quantize.py#L278) for details.

So my personal habit...
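For reference, the gating condition in the linked quantize.py comes down to whether the checkpoint's quantization mode carries both activation and weight quantization. A minimal sketch, assuming the `QuantMode` helpers exposed by recent TensorRT-LLM releases:

```python
from tensorrt_llm.quantization import QuantMode

# A SmoothQuant checkpoint quantizes both activations and weights,
# which is what steers the build toward the smooth-quant GEMM plugins.
sq_mode = QuantMode.use_smooth_quant(per_token=True, per_channel=True)
print(sq_mode.has_act_and_weight_quant())  # True -> smooth-quant plugin path

# A weight-only (W8A16) checkpoint does not, so the plain fp16 GEMM is used,
# matching the behaviour discussed above.
wo_mode = QuantMode.use_weight_only(use_int4_weights=False)
print(wo_mode.has_act_and_weight_quant())  # False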

@jklj077 Could you clarify what the qwen2-gptq quantization actually quantizes? Does it only quantize the weights?
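For context, GPTQ is a weight-only scheme (typically W4A16): weights are stored in low precision while activations stay in fp16 and weights are dequantized on the fly for the GEMMs. A minimal sketch, assuming the Hugging Face transformers GPTQ integration and an assumed checkpoint name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name is assumed for illustration.
model_id = "Qwen/Qwen2-7B-Instruct-GPTQ-Int4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The quantization_config stored in config.json records the weight-only
# settings (e.g. bits and group_size); activations are not quantized.
print(model.config.quantization_config)
```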