GaoYuYang
Thanks for your reply, I'll try it.
I have tried these two methods. I can't find any duplicate node or tensor using polygraphy, and it still has the same error after using fold-constants. But due to confidentiality requirements, i...
> I followed the file changes to modify the file, but when I use the code below: `python -m vllm.entrypoints.openai.api_server --model /root/DeepSeek-V2-Chat --trust-remote-code` an error occurs: >...
> @fengyang95, as promised, a little update: we have solved the BF16 precision issue and verified that the output currently aligns with the HF model. Please bear with us while we spend some time to package up...
@luccafong Nice work, it inspired me a lot! I have two very small questions that have been bothering me. 1. I notice you use the luccafong/deepseek_mtp_main_random model. I wonder is it only...
@KuntaiDu Hi, I just want to know what the relationship is between your work and PR [#2809](https://github.com/vllm-project/vllm/pull/2809). It seems you both want to implement something like "disaggregated prefilling" to separate...