PaddleNLP

[Question]: ERNIE-Layout model fine-tuning: ZeroDivisionError: float division by zero

Open ZzyChris97 opened this issue 3 years ago • 2 comments

Please ask your question

Running the ERNIE-Layout model raises an error. The command is copied unchanged from the README page:

python -u run_ner.py \
  --model_name_or_path ernie-layoutx-base-uncased \
  --output_dir ./ernie-layoutx-base-uncased/models/xfund_zh/ \
  --dataset_name xfund_zh \
  --do_train \
  --do_eval \
  --lang "ch" \
  --max_steps 20000 \
  --eval_steps 100 \
  --save_steps 100 \
  --save_total_limit 1 \
  --load_best_model_at_end \
  --pattern ner-bio \
  --preprocessing_num_workers 4 \
  --overwrite_cache false \
  --use_segment_box \
  --doc_stride 128 \
  --target_size 1000 \
  --per_device_train_batch_size 4 \
  --per_device_eval_batch_size 4 \
  --learning_rate 1e-5 \
  --lr_scheduler_type constant \
  --gradient_accumulation_steps 1 \
  --seed 1000 \
  --metric_for_best_model eval_f1 \
  --greater_is_better true \
  --overwrite_output_dir

Error message

[2022-10-27 16:31:59,381] [    INFO] - 
Training completed. 

Traceback (most recent call last):
  File "/home/hr/projects/ernie-layout/run_ner.py", line 228, in <module>
    main()
  File "/home/hr/projects/ernie-layout/run_ner.py", line 194, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/hr/anaconda3/lib/python3.9/site-packages/paddlenlp/trainer/trainer_base.py", line 652, in train
    train_loss = self._total_loss_scalar / self.state.global_step
ZeroDivisionError: float division by zero
  0%|                                                                             | 0/20000 [00:00<?, ?it/s]

ZzyChris97, Oct 27 '22 08:10
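The traceback fails on `self._total_loss_scalar / self.state.global_step`, so `global_step` was 0 when training "completed": the loop exited without running a single step, which matches the `0/20000` progress bar. One plausible cause is a training dataloader that yields no batches. The sketch below only mimics that division with an explicit guard so the failure mode is easier to see. `average_train_loss` is a hypothetical helper for illustration, not part of PaddleNLP.

```python
def average_train_loss(total_loss_scalar: float, global_step: int) -> float:
    """Hypothetical stand-in for the division shown in the traceback.

    Raises a descriptive error instead of ZeroDivisionError when no
    training step was executed.
    """
    if global_step == 0:
        raise RuntimeError(
            "No training step was executed (global_step == 0). "
            "The training dataloader may be empty; check that the dataset loaded."
        )
    return total_loss_scalar / global_step


if __name__ == "__main__":
    # Normal case: accumulated loss of 6.0 over 3 steps.
    print(average_train_loss(6.0, 3))

    # Failure case from the issue: the loop ended before any step ran.
    try:
        average_train_loss(0.0, 0)
    except RuntimeError as err:
        print(f"caught: {err}")
```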

Also, how do I build and annotate my own dataset for fine-tuning? Is there any material on this?

ZzyChris97, Oct 27 '22 08:10

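@ZzyChris97 Hello, please check whether the dataset is being loaded correctly. The data annotation tooling has not been open-sourced yet; it will be released soon, so stay tuned.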

linjieccc, Oct 27 '22 11:10
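A minimal sketch of the suggested check, assuming the training split is any object that supports `len()` (a list is used here as a stand-in). `expected_batches` and `check_dataset_loaded` are hypothetical helpers, not PaddleNLP functions; in `run_ner.py` the same idea would be applied to whatever dataset object is passed to the trainer, before `trainer.train()` is called.

```python
import math
from typing import Sized


def expected_batches(dataset: Sized, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches one pass over `dataset` would produce."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    n = len(dataset)
    return n // batch_size if drop_last else math.ceil(n / batch_size)


def check_dataset_loaded(dataset: Sized, batch_size: int, name: str = "train") -> int:
    """Fail early, with a readable message, if the split would yield no batches."""
    n_batches = expected_batches(dataset, batch_size)
    if n_batches == 0:
        raise ValueError(
            f"The {name} split is empty (len == {len(dataset)}). "
            "Check the dataset name, download, and any cached preprocessing output."
        )
    return n_batches


if __name__ == "__main__":
    # Stand-in data: 10 examples with the batch size of 4 used in the issue's command.
    print(check_dataset_loaded(list(range(10)), batch_size=4))

    # An empty split reproduces the condition that leads to global_step == 0.
    try:
        check_dataset_loaded([], batch_size=4)
    except ValueError as err:
        print(f"caught: {err}")
```

If the check fails, a stale preprocessing cache is one thing worth ruling out, since the command in the issue passes `--overwrite_cache false`.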

OP, did you manage to solve this? I would appreciate any pointers. [email protected]

taojing7, Dec 15 '22 07:12

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot], Feb 14 '23 00:02

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot], Feb 28 '23 00:02