ZX-ModelCloud
The longest text in the calibration_dataset you're using has a length of **40123**, which takes up too much memory. You can limit the maximum length. ``` from datasets import load_dataset from...
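As a rough illustration of limiting the length, here is a minimal sketch. It assumes the calibration data is a plain list of strings and that a character cap is an acceptable proxy for memory use. `limit_calibration_texts`, `max_chars`, and `drop_long` are hypothetical names for this example, not part of GPTQModel or `datasets`.

```python
def limit_calibration_texts(texts, max_chars=4096, drop_long=False):
    """Cap the length of each calibration sample.

    Hypothetical helper: samples longer than max_chars are truncated,
    or skipped entirely when drop_long is True.
    """
    limited = []
    for text in texts:
        if len(text) <= max_chars:
            limited.append(text)
        elif not drop_long:
            # Keep only the first max_chars characters of an oversized sample.
            limited.append(text[:max_chars])
    return limited


# Stand-in data: one short sample and one as long as the reported maximum.
samples = ["short sample", "x" * 40123]
calibration_dataset = limit_calibration_texts(samples, max_chars=2048)
print([len(t) for t in calibration_dataset])  # [12, 2048]
```

Truncating keeps every sample in the set, while `drop_long=True` discards oversized samples instead. Which is preferable depends on how much the tail of the long texts matters for calibration.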
Regarding bug 1: **positional_embedding** is a non-persistent buffer, but it is being written to safetensors after calling **gptqmodel.utils.model.get_state_dict_for_save()**. This error is likely related to **offload_disk**. I am checking the relevant...
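For context, a small PyTorch sketch of what "non-persistent buffer" means and how such a buffer could be filtered out of a state dict before saving. This is not GPTQModel's code: `TinyEmbed`, `non_persistent_buffer_names`, and `drop_non_persistent` are hypothetical names, and the sketch relies on `_non_persistent_buffers_set`, a private `torch.nn.Module` attribute that may change between PyTorch versions.

```python
import torch
from torch import nn


class TinyEmbed(nn.Module):
    """Toy module with one non-persistent buffer (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)
        # persistent=False keeps this buffer out of state_dict().
        self.register_buffer(
            "positional_embedding", torch.zeros(8, 4), persistent=False
        )


def non_persistent_buffer_names(model):
    """Collect fully qualified names of all non-persistent buffers."""
    names = set()
    for prefix, module in model.named_modules():
        # Private PyTorch attribute; assumed stable enough for a sketch.
        for name in module._non_persistent_buffers_set:
            names.add(f"{prefix}.{name}" if prefix else name)
    return names


def drop_non_persistent(state_dict, model):
    """Return a copy of state_dict without non-persistent buffers."""
    skip = non_persistent_buffer_names(model)
    return {k: v for k, v in state_dict.items() if k not in skip}


model = TinyEmbed()
print("positional_embedding" in model.state_dict())  # False

# Simulate a save path that collected the buffer by mistake.
leaky = dict(model.state_dict())
leaky["positional_embedding"] = model.positional_embedding
cleaned = drop_non_persistent(leaky, model)
print(sorted(cleaned))  # ['proj.bias', 'proj.weight']
```

A checkpoint written from `cleaned` would match what `model.state_dict()` produces, so a loader that expects only persistent entries would not encounter an unexpected `positional_embedding` key.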
Regarding bugs 2 and 3: both have been fixed on the main branch of vllm. https://github.com/vllm-project/vllm/pull/29896/files#diff-a65936ff683c1b4c8d7f3cdd49c28022f38d5e7cfbee857e7dc8c4f6731af0f9R1141-R1152