cfeng16
Hi @ArlindKadra, I'm facing the same problem now with the AMI corpus dataset. During training, I concatenate all the training samples into a single NumPy matrix, and...
Hi, I ran into the same problem. Have you solved it?
Same question ^^
Does it now support gradient accumulation for multiple models?
Can we use gradient accumulation for multiple models in distributed training?
Btw, where can I find the tokenizer?
@Artiprocher Would you mind sharing some instructions on how to do multi-node training for WAN-1.3b with DiffSynth-Studio? I'd really appreciate any help you can provide.