
lm_head is not converted to QuantLinear with MXFP4/8

Open xin3he opened this issue 2 months ago • 14 comments

lm_head quantization still has some issues:

  • needs a deepcopy if tied_word_embedding = True
  • export is not applied to lm_head

Shall we warn users that lm_head is not supported? @WeiweiZhang1 @wenhuach21

xin3he avatar Nov 17 '25 05:11 xin3he

BTW, AFAIK, QuantLinear for MXFP4/8 has no forward function, which may confuse users about how to use it. Do we plan to support it?

xin3he avatar Nov 17 '25 05:11 xin3he

If tied_word_embedding = True, lm_head quant is disabled by default. What's the issue? What do you mean by "QuantLinear for MXFP4/8 has no forward function"?

wenhuach21 avatar Nov 17 '25 05:11 wenhuach21

If a user prefers to quantize lm_head, what's the solution?

[screenshot]

xin3he avatar Nov 17 '25 05:11 xin3he
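
(For reference, a rough sketch of what requesting lm_head quantization might look like with the Python API. The `quant_lm_head` flag and the per-layer bits option are the two approaches mentioned later in this thread; exact argument names may differ across auto-round versions, and the scheme arguments are omitted.)

```python
# Illustrative sketch only: quant_lm_head and layer_config follow the options
# named later in this thread and may differ by auto-round version.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/Qwen3-8B"  # example model used later in this thread
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(
    model,
    tokenizer,
    quant_lm_head=True,                       # option 1: enable lm_head quantization
    # layer_config={"lm_head": {"bits": 8}},  # option 2: set bits for lm_head explicitly
    # ... plus whatever MXFP4/MXFP8 scheme arguments you normally pass
)
autoround.quantize_and_save("./qwen3-8b-mxfp4")
```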

How do you run the model? Why does QuantLinear have no forward?

wenhuach21 avatar Nov 17 '25 05:11 wenhuach21

It's not implemented in AutoRound: https://github.com/intel/auto-round/blob/8d8a1cd5daaf6e8c71d079eccaec3092fa9af4f1/auto_round/export/export_to_autoround/qlinear_fp.py#L61

xin3he avatar Nov 17 '25 05:11 xin3he
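
(As a generic aside, not specific to any auto-round class: a layer type that does not override forward falls back to torch.nn.Module.forward, which raises when called. A minimal check looks like this.)

```python
import torch

def defines_forward(cls) -> bool:
    # True if the class (or one of its non-Module bases) overrides forward;
    # False if it inherits the unimplemented torch.nn.Module.forward.
    return cls.forward is not torch.nn.Module.forward
```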

How do you use the model? Please attach the cmd. After packing and saving, the model should be reloaded, and this MXFP4QuantLinear layer should be called.

wenhuach21 avatar Nov 17 '25 05:11 wenhuach21

I was running the model directly after quantize_and_save() and only just became aware that AutoRound requires reloading before inference.

xin3he avatar Nov 17 '25 05:11 xin3he
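
(A minimal sketch of the save-then-reload flow described above: pack and export first, then reload the checkpoint before inference. `autoround` is the AutoRound object from the earlier sketch, and the path is a placeholder.)

```python
# Pack and export the quantized model, then reload the exported checkpoint
# before running inference; running the in-memory model right after
# quantize_and_save() is what hit the missing-forward issue above.
from transformers import AutoModelForCausalLM, AutoTokenizer

save_dir = "./qwen3-8b-mxfp4"               # placeholder path
autoround.quantize_and_save(save_dir)

model = AutoModelForCausalLM.from_pretrained(save_dir, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(save_dir)
inputs = tokenizer("Hello", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=8)[0]))
```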

I tried Qwen3-8B, which does not use tied_word_embedding, and the lm_head is still not quantized. I noticed that the quantization progress bar includes this op, but module replacement is not applied.

[screenshot]

xin3he avatar Nov 17 '25 05:11 xin3he

Do you enable quant_lm_head or set bits for lm_head?

wenhuach21 avatar Nov 17 '25 05:11 wenhuach21

[screenshot]

xin3he avatar Nov 17 '25 05:11 xin3he

Do we plan to support lm_head quantization when tied_word_embedding=True?

xin3he avatar Nov 17 '25 06:11 xin3he

[screenshot]

After reloading, I saw that lm_head is quantized; I'm not sure what is happening. @WeiweiZhang1 Do you have any comments? Do you think it's a bug, or is it by design?

xin3he avatar Nov 17 '25 06:11 xin3he
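
(A quick way to see what the reload produced, assuming the checkpoint was saved as in the sketches above: inspect the runtime class of lm_head and whether the weights are tied.)

```python
# Check what lm_head became after reloading the exported checkpoint.
from transformers import AutoModelForCausalLM

reloaded = AutoModelForCausalLM.from_pretrained("./qwen3-8b-mxfp4", torch_dtype="auto")
print(type(reloaded.get_output_embeddings()))  # e.g. a QuantLinear subclass if converted
print(reloaded.config.tie_word_embeddings)     # whether lm_head shares weights with the embedding
```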

At the very least, we need to warn users that Xin's usage is not supported.

wenhuach21 avatar Nov 17 '25 06:11 wenhuach21

[screenshot] For NVFP4, lm_head quantization hits an assert error during exporting.

xin3he avatar Nov 17 '25 07:11 xin3he