How to convert a fine-tuned MOSS model to a quantized version?

Open qgpmztmf opened this issue 2 years ago • 3 comments

I couldn't find the code that implements this process in this repository. Has anyone successfully converted a fine-tuned MOSS model to its quantized version? If so, could you please share the steps or code you used?

qgpmztmf · May 08 '23 06:05

.

qgpmztmf · May 11 '23 08:05

I tested their int4 model and found that inference after quantization is actually slower than before quantization.

JIEKEXIAN · May 16 '23 02:05

> I tested their int4 model and found that inference after quantization is actually slower than before quantization.

Quantization does not necessarily speed up inference; its main purpose is to reduce the GPU memory the model occupies.
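To put rough numbers on the memory side, here is a back-of-the-envelope sketch. It assumes a model of about 16 billion parameters (an illustrative figure, not taken from this thread) and counts only the weights, ignoring activations, the KV cache, and any per-group scales or zero points that a real quantization scheme stores alongside them. The helper name `weight_memory_gib` is made up for this example.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Assumption: roughly 16B parameters, used here only for illustration.
NUM_PARAMS = 16_000_000_000

# Weight-only footprint at each precision.
footprints = {
    name: weight_memory_gib(NUM_PARAMS, bits)
    for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]
}

for name, gib in footprints.items():
    print(f"{name}: {gib:.1f} GiB")
```

Under these assumptions int4 weights take a quarter of the fp16 footprint. That says nothing about speed: low-bit weights generally have to be unpacked or dequantized during inference, which is one common reason a quantized model can end up slower than the fp16 one.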

qgpmztmf · May 31 '23 07:05