pluo911

Results 5 comments of pluo911

In an fc layer, IN and LN should be the same. R50v2+SN converges much faster than R50v1+SN and produces better top-5 accuracy.

@GYxiaOH Try batch average when evaluating BN in SN. Batch average is more stable than moving average for BN. In some tasks there could be a difference; please see figure 8 in...
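
For readers unfamiliar with the two estimators, here is a minimal sketch of the difference, assuming scalar activations and equal-sized mini-batches. The helper names (`batch_means`, `moving_average`, `batch_average`) and the momentum value are hypothetical and are not taken from the SN code base.

```python
import random


def batch_means(data, batch_size):
    """Mean of each consecutive, full mini-batch of scalar activations."""
    return [
        sum(data[i:i + batch_size]) / batch_size
        for i in range(0, len(data) - batch_size + 1, batch_size)
    ]


def moving_average(values, momentum=0.9):
    """Exponential moving average, as commonly tracked during training.

    The result depends on the initial value (0.0 here), on the momentum,
    and on the order of the batches, since recent batches weigh more.
    """
    running = 0.0
    for v in values:
        running = momentum * running + (1.0 - momentum) * v
    return running


def batch_average(values):
    """Plain average of per-batch statistics, computed with frozen weights."""
    return sum(values) / len(values)


random.seed(0)
# Assumed toy data: 640 activations, split into 20 mini-batches of 32.
activations = [random.gauss(2.0, 1.0) for _ in range(640)]
means = batch_means(activations, batch_size=32)

ema_estimate = moving_average(means)   # order- and momentum-dependent
avg_estimate = batch_average(means)    # every batch weighted equally
```

In this toy setup the batch average weights every mini-batch equally, so it matches the mean over all activations, while the moving average shifts with the momentum and the batch order. The same averaging would apply to the per-batch variances.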

Thanks for your interest. SN benefits from adding 0.5 dropout in the last layer of hidden features, but GN and BN might not. The improvement depends on the generalization error...

@Latou GN can be included in SN. You may try GN on your problem.
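
As a rough illustration of what including GN could look like, here is a NumPy sketch that adds group statistics as a fourth candidate next to IN, LN and BN, and blends all four with softmax weights. It assumes NCHW inputs, uses training-mode batch statistics, and omits the learned scale and shift. The function name `switchable_norm_with_gn` and its parameters are hypothetical, not part of the released SN code.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    e = np.exp(z - z.max())
    return e / e.sum()


def stats(x, axes):
    """Mean and (biased) variance over the given axes, keeping dims."""
    return x.mean(axis=axes, keepdims=True), x.var(axis=axes, keepdims=True)


def switchable_norm_with_gn(x, mean_logits, var_logits, num_groups=2, eps=1e-5):
    """Blend IN, LN, BN and GN statistics; x has shape (N, C, H, W)."""
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"

    mu_in, var_in = stats(x, (2, 3))      # per sample, per channel
    mu_ln, var_ln = stats(x, (1, 2, 3))   # per sample
    mu_bn, var_bn = stats(x, (0, 2, 3))   # per channel, across the batch

    # GN: statistics per sample and per group of consecutive channels.
    per_group = c // num_groups
    xg = x.reshape(n, num_groups, per_group, h, w)
    mu_g, var_g = stats(xg, (2, 3, 4))
    mu_gn = np.repeat(mu_g.reshape(n, num_groups, 1, 1), per_group, axis=1)
    var_gn = np.repeat(var_g.reshape(n, num_groups, 1, 1), per_group, axis=1)

    # Separate importance weights for means and variances, each summing to 1.
    wm = softmax(np.asarray(mean_logits, dtype=float))
    wv = softmax(np.asarray(var_logits, dtype=float))

    mean = wm[0] * mu_in + wm[1] * mu_ln + wm[2] * mu_bn + wm[3] * mu_gn
    var = wv[0] * var_in + wv[1] * var_ln + wv[2] * var_bn + wv[3] * var_gn
    return (x - mean) / np.sqrt(var + eps)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 3, 3))
# Equal logits give each of the four normalizers the same weight.
y = switchable_norm_with_gn(x, np.zeros(4), np.zeros(4))
```

In a real layer the two logit vectors would be learned along with the network, so training could decide how much weight GN receives.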

@eugenelawrence We are planning to do this. Contributions are welcome.