BYOL-PyTorch
The code without syncbn will collapse
I noticed the paper "Momentum2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning". It is an interesting piece of work.

The results using plain BN in all layers do not collapse.
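For reference, switching between plain BN and SyncBN in PyTorch is a one-line conversion. This is a minimal sketch (the toy `nn.Sequential` model is a hypothetical stand-in for the BYOL encoder, not the code from this repo); at training time SyncBN additionally requires an initialized `torch.distributed` process group so that statistics are computed across all GPUs:

```python
import torch.nn as nn

# Toy backbone with plain BatchNorm layers (hypothetical stand-in
# for the BYOL encoder in this repo).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# Replace every BatchNorm layer with SyncBatchNorm so that running
# statistics are computed over the whole global batch across GPUs
# (forward passes then need torch.distributed to be initialized).
sync_model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

print(type(sync_model[1]).__name__)  # SyncBatchNorm
```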
I suspect the results may come from the L2Norm being applied over both views together. Should we split the two views for testing?
Try increasing the weight decay, e.g. to 5e-4, and applying it to the BN parameters and biases as well. I have also tried adding the shuffling BN from MoCo, which helps a lot. The paper you mentioned adopted a weight decay of 1e-4 without LARS.
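A minimal sketch of that weight-decay suggestion (the toy model is a hypothetical stand-in, not this repo's encoder): many recipes build separate parameter groups to exclude BN weights and biases from decay, but applying decay to everything is simply the single-group default in PyTorch.

```python
import torch
import torch.nn as nn

# Toy model standing in for the BYOL encoder + projector (hypothetical).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.Flatten(),
    nn.Linear(16 * 32 * 32, 128),
)

# One parameter group: weight decay 5e-4 is applied to ALL parameters,
# including BN affine weights/biases and the conv/linear biases,
# as suggested above (lr/momentum values here are illustrative).
optimizer = torch.optim.SGD(model.parameters(), lr=0.03,
                            momentum=0.9, weight_decay=5e-4)
```

To instead exclude BN and bias parameters (the more common recipe), one would pass two dicts to `SGD`, with `weight_decay=0.0` for the parameters whose shape is one-dimensional.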