Is AWQ officially supported now?

Open lifelongeeek opened this issue 1 year ago • 3 comments

I can see here that optimum-quanto provides several external (weight-only) quantization algorithms, such as SmoothQuant and AWQ.

It looks like SmoothQuant only supports OPT models, and AWQ is still under development. Do you have any further development plans for AWQ?

lifelongeeek avatar Sep 20 '24 04:09 lifelongeeek

Oh, I noticed that optimum-quanto offers HQQ, a calibration-data-free quantization algorithm, and that it achieves fairly good perplexity with Sheared-LLaMA-1.3B. Is HQQ officially supported in optimum-quanto?

lifelongeeek avatar Sep 20 '24 04:09 lifelongeeek

HQQ and AWQ both use the same group-wise quantization scheme introduced by GPTQ. They differ from the original GPTQ algorithm only in how they select the scales and adjust the weights during post-training quantization (PTQ). Quanto uses a similar algorithm, so it is strictly equivalent to these methods, although AWQ PTQ is not implemented.
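To make "group-wise quantization" concrete, here is a minimal sketch in plain Python. It is not quanto's, HQQ's, or AWQ's actual code: the function names, the min/max scale selection, and the floating-point zero-point are all assumptions chosen for illustration. Each group of `group_size` consecutive weights gets its own scale and offset. The step that picks the scale is the part the methods above do differently.

```python
from typing import List, Tuple


def quantize_groupwise(
    weights: List[float], group_size: int = 4, bits: int = 4
) -> Tuple[List[int], List[float], List[float]]:
    """Hypothetical helper: min/max group-wise quantization of a flat weight row."""
    if len(weights) % group_size != 0:
        raise ValueError("len(weights) must be a multiple of group_size")
    qmax = (1 << bits) - 1  # largest integer level, 15 for 4-bit
    q: List[int] = []
    scales: List[float] = []
    zeros: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Assumption: plain min/max scale selection. Guard against constant groups.
        scale = (hi - lo) / qmax if hi > lo else 1.0
        scales.append(scale)
        zeros.append(lo)  # offset kept as a float for simplicity
        for w in group:
            level = round((w - lo) / scale)
            q.append(max(0, min(qmax, level)))  # clamp to the valid range
    return q, scales, zeros


def dequantize_groupwise(
    q: List[int], scales: List[float], zeros: List[float], group_size: int = 4
) -> List[float]:
    """Map integer levels back to approximate weights, group by group."""
    return [
        q[i] * scales[i // group_size] + zeros[i // group_size]
        for i in range(len(q))
    ]


weights = [0.0, 0.5, 1.0, 1.5, -2.0, -1.0, 0.0, 1.0]
q, scales, zeros = quantize_groupwise(weights, group_size=4, bits=4)
restored = dequantize_groupwise(q, scales, zeros, group_size=4)
```

With min/max scaling, the reconstruction error of each weight is bounded by half of its group's scale, which is why smaller groups generally give better accuracy at the cost of storing more scales.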

If you plan to reuse existing HQQ or AWQ weights: sadly, their implementations store the quantized weights in different formats that quanto can't load at the moment. Quanto does, however, include packing/unpacking code compatible with AWQ weights, so it should be fairly easy to write a conversion script.
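For readers unfamiliar with what "packing" means here, the sketch below stores two 4-bit values in each byte and recovers them again. The layout (low nibble first) and the names `pack_uint4` and `unpack_uint4` are assumptions for illustration only. The actual AWQ and quanto layouts use their own word sizes and element ordering, which are not reproduced here, and a conversion script would essentially unpack from one layout and repack into the other.

```python
from typing import List


def pack_uint4(values: List[int]) -> bytes:
    """Hypothetical layout: two 4-bit values per byte, low nibble first."""
    if len(values) % 2 != 0:
        raise ValueError("need an even number of values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("values must fit in 4 bits")
    return bytes(lo | (hi << 4) for lo, hi in zip(values[0::2], values[1::2]))


def unpack_uint4(packed: bytes) -> List[int]:
    """Inverse of pack_uint4: split each byte back into two 4-bit values."""
    out: List[int] = []
    for byte in packed:
        out.append(byte & 0x0F)  # low nibble was stored first
        out.append(byte >> 4)    # high nibble second
    return out


levels = [1, 2, 3, 4, 15, 0]
packed = pack_uint4(levels)
assert unpack_uint4(packed) == levels  # round trip is lossless
```

Packing only changes how the integer levels are laid out in memory; the scales and zero-points are stored separately and are unaffected.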

dacorvo avatar Sep 20 '24 06:09 dacorvo

Thanks for the detailed explanation.

> although it includes packing/unpacking code compatible with AWQ weights (so it should be fairly easy to write a conversion script).

Could you point us to any relevant reference docs or code? I am interested in this direction.

lifelongeeek avatar Sep 20 '24 07:09 lifelongeeek

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Oct 21 '24 02:10 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Oct 27 '24 02:10 github-actions[bot]