FlexGen
Add support for Llama and Qwen models
This PR adds support for Llama and Qwen models. Starting from the existing OPT scripts, it adds RMSNorm and RoPE and adjusts some parameters to match each model architecture.
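For readers unfamiliar with the two components named above, here is a minimal pure-Python sketch of what RMSNorm and RoPE compute. It is not the code from this PR: the function names `rms_norm` and `apply_rope`, the `eps` and `base` defaults, and the choice to pair element `i` with element `i + d/2` (the convention used in the Hugging Face Llama implementation) are all assumptions made for illustration.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then scale by a learned weight.

    Unlike LayerNorm, no mean is subtracted and no bias is added.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def apply_rope(x, position, base=10000.0):
    """Rotate each pair (x[i], x[i + d/2]) by the angle position * base**(-2i/d).

    Assumes len(x) is even. The pairing convention is an assumption;
    other implementations pair adjacent elements instead.
    """
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out
```

Because each pair is rotated by an angle proportional to its position, the dot product between a rotated query and a rotated key depends only on the distance between their positions, which is why RoPE can replace a learned absolute position embedding.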
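I adopted your commit. In the Qwen test settings, the input sequence is padded to 128 tokens by default. When I change this input sequence length, I get an error. Am I using it incorrectly?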