EfficientFormer
Are there any experiments comparing LN and BN?
In Section 3, under Observation 3, the paper states:
"Based on our ablation study in Appendix Tab.3, CONV-BN only slightly downgrades performance compared to LN".
However, I found no LN-related experiments in Appendix Tab.3; it contains only BN and GN.
Thanks for pointing this out. We will revise the manuscript and add ablation results on LN.
Initially, GN with groups=1, which is the setting used in the current ablation, was assumed to be equivalent to LN. However, the authors of PoolFormer point out that this does not hold in the 4D case, because GN on a 4D tensor and LN in a 3D ViT normalize over different dimensions ([C, H, W] versus [C]). Please also refer to their clarifications for details.
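
To make the difference concrete, here is a minimal pure-Python sketch, not code from EfficientFormer or PoolFormer. It assumes a single sample stored as nested lists of shape [C][H][W], omits the learnable affine parameters, and uses hypothetical helper names (`normalize`, `group_norm_one_group`, `layer_norm_channels`) chosen only for illustration.

```python
from statistics import fmean

EPS = 1e-5  # small constant for numerical stability (value chosen for illustration)


def normalize(values, eps=EPS):
    """Standardize a flat list to zero mean and unit variance."""
    mean = fmean(values)
    var = fmean([(v - mean) ** 2 for v in values])
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def group_norm_one_group(x):
    """GN with groups=1 on one [C][H][W] sample: statistics over C, H, and W."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    flat = normalize([x[c][h][w] for c in range(C) for h in range(H) for w in range(W)])
    return [[[flat[(c * H + h) * W + w] for w in range(W)] for h in range(H)] for c in range(C)]


def layer_norm_channels(x):
    """ViT-style LN on the same sample: statistics over C only, per spatial position."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for h in range(H):
        for w in range(W):
            column = normalize([x[c][h][w] for c in range(C)])
            for c in range(C):
                out[c][h][w] = column[c]
    return out


# Toy input with C=2, H=1, W=2; the two spatial positions differ in magnitude.
x = [[[0.0, 10.0]], [[2.0, 12.0]]]
gn = group_norm_one_group(x)
ln = layer_norm_channels(x)
print("GN (groups=1):", gn)
print("LN over C:    ", ln)
```

On this toy input, LN maps the channel values at each position to roughly -1 and 1, while GN with one group pools the statistics across both positions, so the two outputs differ.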