PaddleNLP
[Bug]: GPTQ quantization fails for the LLM model openlm-research/open_llama_3b.
Software environment
paddle-bfloat 0.1.7
paddle2onnx 1.0.8
paddlefsl 1.1.0
paddleocr 2.6.1.3
paddlepaddle-gpu 0.0.0.post118
paddleslim 0.0.0.dev0
Duplicate check
- [x] I have searched the existing issues
Error description
GPTQ quantization fails for the LLM model openlm-research/open_llama_3b.
Steps to reproduce & code
Command: python3 finetune_generation.py ./llama_3b/gptq_argument.json
Config file:
{
    "model_name_or_path": "openlm-research/open_llama_3b",
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "eval_accumulation_steps": 16,
    "src_length": 1024,
    "max_length": 2048,
    "fp16": true,
    "fp16_opt_level": "O2",
    "dataset_name_or_path": "./data",
    "output_dir": "./checkpoints/llama_3b_gptq_ckpts",
    "do_eval": true,
    "eval_with_do_generation": false,
    "do_gptq": true,
    "gptq_step": 8,
    "quant_type": "weight_only_int4"
}
Error message:
We have indeed seen this problem with some models; it is most likely a model-precision issue. Our unit tests use llama2-7b, which does not hit this problem.
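To illustrate why quantization can be precision-sensitive for some models, below is a minimal NumPy sketch of per-group weight-only int4 round-to-nearest quantization. This is only an illustrative stand-in, not PaddleNLP's actual GPTQ implementation (real GPTQ additionally corrects rounding error using second-order statistics of the activations); the function names and the group size are assumptions for the example.

```python
import numpy as np

def quant_weight_int4(w, group_size=64):
    """Per-group symmetric round-to-nearest int4 quantization.

    Illustrative sketch only; group_size=64 is an arbitrary choice here,
    not PaddleNLP's setting.
    """
    w = w.reshape(-1, group_size)
    # int4 symmetric range is [-8, 7]; scale each group to fit.
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scale), -8, 7)
    return q, scale

def dequant(q, scale):
    # Recover an approximation of the original weights.
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
q, s = quant_weight_int4(w)
w_hat = dequant(q, s).reshape(-1)
# Rounding error per element is bounded by half a quantization step,
# but weights with large outliers inflate the scale and hence the error,
# which is one common reason quantization degrades some checkpoints.
err = np.abs(w - w_hat).max()
```

Models whose weight distributions contain heavy outliers (larger per-group scales) lose more precision under this scheme, which is consistent with the observation that some checkpoints quantize cleanly while others fail.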
This issue is stale because it has been open for 60 days with no activity.