SpinQuant merge
Hello,
Why has the SpinQuant branch not been merged? Do you plan to merge it anytime soon?
Thanks
As noted in a prior issue, the author has no immediate plans to extend SpinQuant to other models, so a merge is unlikely anytime soon :(
Since SpinQuant requires training, it may involve significant changes to the code structure. Therefore, we do not plan to merge it into the main branch in the near term. Thank you for your understanding.
I understand, thank you. Have you thought about adding it without the training process? Ready-to-use learned rotation matrices are available on the SpinQuant drive (link from the SpinQuant repo: https://drive.google.com/drive/folders/1R2zix4qeXBjcmgnJN1rny93cguJ4rEE8), and the user could provide them.
Perhaps the author would prefer to develop a quantization framework that can adapt to various models (not just LLaMA and OPT).
If a fixed rotation matrix is used, there is actually no difference from QuaRot (つд⊂)
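For context on that comparison: both approaches rely on folding an orthogonal rotation into the weights without changing the layer output. QuaRot uses fixed (randomized Hadamard) rotations, while SpinQuant learns them. Below is a minimal pure-Python sketch of that invariance, assuming a plain 4x4 Hadamard matrix and made-up toy weights. All names are illustrative; this is not llmc, QuaRot, or SpinQuant code.

```python
import math

def hadamard(n):
    """Sylvester construction of an orthonormal n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # Block form [[H, H], [H, -H]] doubles the size each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# Toy linear layer (2 outputs, 4 inputs) and one toy activation vector.
W = [[0.5, -1.0, 2.0, 0.25],
     [1.5, 0.0, -0.75, 3.0]]
x = [1.0, 2.0, -1.0, 0.5]

R = hadamard(4)                    # fixed rotation, no training involved
W_rot = matmul(W, R)               # rotation folded into the weights
x_rot = matvec(transpose(R), x)    # activations rotated by R^T

y_ref = matvec(W, x)
y_rot = matvec(W_rot, x_rot)       # (W R)(R^T x) = W x because R R^T = I
```

Any orthogonal `R` satisfies this identity, so a learned matrix and a fixed one are interchangeable at inference time; they differ only in how `R` was obtained and in how well the rotated tensors quantize.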
(Sent by email in reply to Marouan-git's comment on ModelTC/llmc#368, dated Thu, May 8, 2025.)
I’ll try to allocate some time to support loading pre-trained rotation matrices directly as a feature.
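A loader for user-supplied matrices would probably want a sanity check before applying them. Here is a minimal sketch, assuming the matrix has already been read into nested Python lists. The on-disk format of the SpinQuant files is not covered here, and `check_rotation` is a hypothetical helper, not an existing llmc API.

```python
import math

def check_rotation(r, hidden_size, tol=1e-4):
    """Raise ValueError unless r is a hidden_size x hidden_size orthogonal matrix."""
    n = len(r)
    if n != hidden_size or any(len(row) != n for row in r):
        raise ValueError(f"expected a {hidden_size}x{hidden_size} matrix")
    # Orthogonality: every pair of rows must have dot product 1 (same row) or 0.
    for i in range(n):
        for j in range(i, n):
            dot = sum(a * b for a, b in zip(r[i], r[j]))
            target = 1.0 if i == j else 0.0
            if abs(dot - target) > tol:
                raise ValueError(f"rows {i} and {j} are not orthonormal")
    return True

# Usage with toy 2x2 inputs: a true rotation passes, a scaled one is rejected.
theta = 0.3
good = [[math.cos(theta), -math.sin(theta)],
        [math.sin(theta),  math.cos(theta)]]
bad = [[2.0, 0.0],
       [0.0, 1.0]]

good_ok = check_rotation(good, hidden_size=2)
try:
    check_rotation(bad, hidden_size=2)
    bad_ok = True
except ValueError:
    bad_ok = False
```

The tolerance is a guess; matrices stored in half precision may need a looser value.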