Memory problem

Open GlacierMelt opened this issue 6 years ago • 0 comments

In distributed training, the first GPU uses twice as much memory as the others. Before SWA is applied, memory usage is the same on every GPU.

Oct 04 '19 13:10 GlacierMelt