Question about the training .sh files
Thank you for open-sourcing such great work!
When I tried to train PixWizard, I found four files in the 'exps' folder: train_torchrun_pixwizard_s1.sh, train_torchrun_pixwizard_s2.sh, train_cluster_pixwizard_s1.sh, and train_cluster_pixwizard_s2.sh. I would like to know the differences among these four training files.
I would be grateful if you could explain!
Hi, the train_cluster scripts launch training with srun, while the train_torchrun scripts launch it with torchrun. srun relies on the SLURM scheduler to allocate resources and launch parallel tasks on a cluster, so it is typically used on institutionally deployed clusters; torchrun is PyTorch’s own distributed training launcher and does not need a job scheduler. The s1 and s2 scripts differ in their training data ratios, learning rates, and resolutions. For details, please refer to the training strategy described in our paper.
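To make the srun vs. torchrun distinction concrete, here is a minimal dry-run sketch that only prints what each launch style might look like. It is not taken from the scripts in the 'exps' folder: the entry point train.py, the partition name my_partition, and the node/GPU counts are hypothetical placeholders, so check the actual .sh files for the real arguments.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the two launch styles instead of executing them.
# train.py, my_partition, and the counts below are hypothetical placeholders,
# not values from the PixWizard repository.

TRAIN_SCRIPT="train.py"   # hypothetical entry point
GPUS_PER_NODE=8           # assumed number of GPUs per machine
NODES=1                   # assumed number of machines

# torchrun style: PyTorch's launcher spawns one worker process per GPU
# on the machine where the command is run.
torchrun_cmd() {
  echo "torchrun --nnodes=${NODES} --nproc_per_node=${GPUS_PER_NODE} ${TRAIN_SCRIPT}"
}

# srun style: SLURM allocates the nodes and GPUs, then starts one task per GPU.
srun_cmd() {
  echo "srun --partition=my_partition --nodes=${NODES} --ntasks-per-node=${GPUS_PER_NODE} --gres=gpu:${GPUS_PER_NODE} python ${TRAIN_SCRIPT}"
}

torchrun_cmd
srun_cmd
```

In practice, torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each worker, whereas under srun the training code usually has to derive them from SLURM variables such as SLURM_PROCID and SLURM_NTASKS. If you are training on a single machine or a cloud instance without SLURM, the torchrun scripts are likely the ones you want.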