AssertionError: world_size % group_num == 0 when fine-tuning cogv1.5-i2v in a distributed environment

Open hrcheng98 opened this issue 1 year ago • 0 comments

System Info

image

Information

  • [ ] The official example scripts
  • [X] My own modified scripts and tasks

Reproduction

  1. Run the code on 2×8 GPUs (16 GPUs in total).
  2. The assertion world_size % group_num == 0 fails, because world_size % group_num != 0.

Expected behavior

I found that group_num under loss_fn_config in the YAML config is set to 40. Should group_num be changed to match the number of machines I am using?
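
For reference, the arithmetic behind the failing check can be sketched as below. This is only an illustration of the divisibility condition quoted in the error message, not the repository's actual code: the helper name valid_group_nums is hypothetical, and world_size = 16 assumes one process per GPU on the 2×8 setup described above.

```python
def valid_group_nums(world_size: int) -> list[int]:
    """Return every group_num that satisfies world_size % group_num == 0."""
    return [g for g in range(1, world_size + 1) if world_size % g == 0]


# Assumption: one process per GPU, so 2 machines x 8 GPUs gives 16 processes.
world_size = 2 * 8
# Value reported for group_num under loss_fn_config in the YAML file.
group_num = 40

# The condition from the assertion message; it does not hold for 16 and 40.
print(world_size % group_num == 0)

# Candidate group_num values that would satisfy the condition for this world_size.
print(valid_group_nums(world_size))
```

Under these assumptions, 40 cannot divide 16, so the check fails regardless of other settings. Whether picking one of the listed divisors is the right fix for the loss function is a separate question that the arithmetic alone does not answer.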

hrcheng98 · Nov 15 '24 14:11