Swin-Transformer
Questions about Swin-MoE
When I test the trained Swin-MoE model on multiple GPUs, each process reports a different performance. For each process t, I loaded the weights from the checkpoint for rank t.
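To make the setup concrete, here is a minimal sketch of what I mean. It assumes two things that may not match the actual code: that the per-rank checkpoints are named by appending `.rank{t}` to a base path, and that each process only evaluates its own shard of the test set, so its local accuracy would differ from the others until the counts are combined. The helper names `rank_checkpoint_path` and `aggregate_accuracy` and the numbers are hypothetical, for illustration only.

```python
from typing import Dict, Tuple


def rank_checkpoint_path(base_path: str, rank: int) -> str:
    """Hypothetical naming scheme: one checkpoint file per rank, suffixed .rank{t}."""
    return f"{base_path}.rank{rank}"


def aggregate_accuracy(per_rank_counts: Dict[int, Tuple[int, int]]) -> float:
    """Combine (correct, total) counts from every rank into one global accuracy."""
    correct = sum(c for c, _ in per_rank_counts.values())
    total = sum(t for _, t in per_rank_counts.values())
    if total == 0:
        raise ValueError("no samples were evaluated")
    return correct / total


if __name__ == "__main__":
    # Made-up counts: rank -> (correct predictions, samples seen by that rank).
    counts = {0: (80, 100), 1: (90, 100)}
    for rank, (c, t) in counts.items():
        # Each rank loads its own file and reports only its local accuracy.
        print(rank_checkpoint_path("model.pth", rank), "local accuracy:", c / t)
    # Summing the counts over all ranks gives a single number for the whole test set.
    print("global accuracy:", aggregate_accuracy(counts))
```

If the per-process numbers still differ after combining counts like this, the difference would have to come from somewhere else, for example from which weights each rank actually loaded.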