QuaRot
Support more models.
Thanks for the great work!
This PR adds support for more LLaMA, Qwen2, and Mistral models. It also supports models that use attention_bias (e.g., Qwen2.5 models).
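For reviewers, here is a minimal sketch of why a bias term might need separate handling when a linear layer's output is rotated by an orthogonal matrix. It is not the code in this PR. The helper names (`linear`, `rotate_output`) and the 2x2 example values are hypothetical, and it assumes the rotation is applied on the output side of the layer.

```python
import math

def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def linear(w, b, x):
    # y = W x + b
    return [y + bi for y, bi in zip(matvec(w, x), b)]

def rotate_output(w, b, q):
    # Hypothetical helper: fold an output-side rotation Q^T into the layer.
    # Q^T (W x + b) = (Q^T W) x + Q^T b, so the bias is rotated as well.
    qt = transpose(q)
    return matmul(qt, w), matvec(qt, b)

theta = 0.3
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]  # orthogonal 2x2 rotation

W = [[1.0, 2.0], [3.0, 4.0]]
b = [0.5, -1.0]
x = [2.0, -1.0]

y = linear(W, b, x)
expected = matvec(transpose(Q), y)   # rotate the original output

W_rot, b_rot = rotate_output(W, b, Q)
y_rot = linear(W_rot, b_rot, x)      # output of the fused, rotated layer

W_only, _ = rotate_output(W, b, Q)
y_wrong = linear(W_only, b, x)       # weights rotated, bias left as-is

print(max(abs(u - v) for u, v in zip(y_rot, expected)))
print(max(abs(u - v) for u, v in zip(y_wrong, expected)))
```

Under these assumptions, the first printed difference is at floating-point noise level, while the second is clearly nonzero, which is the mismatch that would appear if only the weights were rotated.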