mlx-examples
Usage of convert and fuse?
In the first step, if I didn't use python convert.py -q to generate a quantized model, does that mean it is unnecessary to pass the -d, --de-quantize parameter to generate a de-quantized model when running python fuse.py?
Correct. If you did not fine-tune a quantized model, you should not de-quantize it during fuse. If you try to de-quantize a model that is not quantized, fuse may throw an error.
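As a rough illustration of that rule, here is a small Python sketch that only adds --de-quantize when the model was actually quantized. It rests on an assumption that is not confirmed in this thread: that a model converted with -q records its settings under a "quantization" key in config.json. The helper names is_quantized and fuse_command are hypothetical, and any other fuse.py arguments are left out on purpose.

```python
import json
from pathlib import Path


def is_quantized(model_dir):
    """Return True if config.json in model_dir has a "quantization" entry.

    Assumption: the quantized conversion writes its settings under a
    "quantization" key in config.json. Check your own model directory.
    """
    with open(Path(model_dir) / "config.json") as f:
        config = json.load(f)
    return "quantization" in config


def fuse_command(model_dir, de_quantize=False):
    """Build the start of a fuse.py command line (hypothetical helper).

    --de-quantize is added only when it was requested AND the model is
    quantized, so a non-quantized model is never asked to de-quantize.
    """
    cmd = ["python", "fuse.py"]  # other fuse.py arguments omitted here
    if de_quantize and is_quantized(model_dir):
        cmd.append("--de-quantize")
    return cmd
```

With this, calling fuse_command(path, de_quantize=True) on a model that was converted without -q simply leaves the flag off instead of risking the error described above.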