PaddleNLP
[Bug]: GPTQ quantization fails for the LLM model openlm-research/open_llama_3b.
Software environment
paddle-bfloat 0.1.7
paddle2onnx 1.0.8
paddlefsl 1.1.0
paddleocr 2.6.1.3
paddlepaddle-gpu 0.0.0.post118
paddleslim 0.0.0.dev0
Duplicate check
- [x] I have searched the existing issues
Error description
GPTQ quantization fails for the LLM model openlm-research/open_llama_3b.
Steps to reproduce & code
Command: python3 finetune_generation.py ./llama_3b/gptq_argument.json
Config file:
{
    "model_name_or_path": "openlm-research/open_llama_3b",
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "eval_accumulation_steps": 16,
    "src_length": 1024,
    "max_length": 2048,
    "fp16": true,
    "fp16_opt_level": "O2",
    "dataset_name_or_path": "./data",
    "output_dir": "./checkpoints/llama_3b_gptq_ckpts",
    "do_eval": true,
    "eval_with_do_generation": false,
    "do_gptq": true,
    "gptq_step": 8,
    "quant_type": "weight_only_int4"
}
Error message:
We have indeed seen this problem with some models; it is most likely a model-precision issue. Our unit tests use llama2-7b, which does not hit this problem.
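To illustrate why quantization can be precision-sensitive for some models, below is a minimal NumPy sketch of per-group weight-only int4 round-to-nearest quantization. This is only an illustrative stand-in, not PaddleNLP's actual GPTQ implementation (real GPTQ additionally corrects rounding error using second-order statistics of the activations); the function names and the group size are assumptions for the example.

```python
import numpy as np

def quant_weight_int4(w, group_size=64):
    """Per-group symmetric round-to-nearest int4 quantization.

    Illustrative sketch only; group_size=64 is an arbitrary choice here,
    not PaddleNLP's setting.
    """
    w = w.reshape(-1, group_size)
    # int4 symmetric range is [-8, 7]; scale each group to fit.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7)
    return q, scale

def dequant(q, scale):
    # Recover an approximation of the original weights.
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
q, s = quant_weight_int4(w)
w_hat = dequant(q, s).reshape(-1)
# Rounding error per element is bounded by half a quantization step,
# but weights with large outliers inflate the scale and hence the error,
# which is one common reason quantization degrades some checkpoints.
err = np.abs(w - w_hat).max()
```

Models whose weight distributions contain heavy outliers (larger per-group scales) lose more precision under this scheme, which is consistent with the observation that some checkpoints quantize cleanly while others fail.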
This issue is stale because it has been open for 60 days with no activity.