Swin-Transformer
DropPath: why scale_by_keep=True? E.g., the samples in the batch that are not masked get multiplied by 1./keep_prob, which may not make sense?
- Here is where drop path is used: https://github.com/microsoft/Swin-Transformer/blob/main/models/swin_transformer.py#L217
- And here is the timm implementation: https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/drop.py#L173
Should scale_by_keep=False be set explicitly?
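To make the question concrete, here is a minimal pure-Python sketch of per-sample drop path. It is not the timm code: the list-of-lists input and the `mean_output` helper are assumptions made for illustration, and only the `drop_prob`, `training` and `scale_by_keep` arguments mirror the real function. It suggests what the scaling is for: with scale_by_keep=True the expected branch output during training matches the eval-time output (the same idea as inverted dropout), while with scale_by_keep=False it shrinks by a factor of keep_prob.

```python
import random


def drop_path(batch, drop_prob=0.0, training=False, scale_by_keep=True, rng=random):
    """Per-sample stochastic depth on a list of per-sample feature lists.

    Illustrative stand-in for a tensor version: one Bernoulli draw per sample,
    applied to every feature of that sample.
    """
    if drop_prob == 0.0 or not training:
        # Eval mode (or no dropping): the branch output passes through unchanged.
        return [list(sample) for sample in batch]
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in batch:
        # mask is 1 with probability keep_prob, else 0 (whole sample dropped).
        mask = 1.0 if rng.random() < keep_prob else 0.0
        if keep_prob > 0.0 and scale_by_keep:
            # Surviving samples are scaled up so that E[mask] == 1.
            mask /= keep_prob
        out.append([value * mask for value in sample])
    return out


def mean_output(scale_by_keep, drop_prob=0.2, trials=100_000, seed=0):
    """Hypothetical helper: average training-time output for a constant input of 1.0."""
    rng = random.Random(seed)
    batch = [[1.0]]
    total = 0.0
    for _ in range(trials):
        total += drop_path(batch, drop_prob, True, scale_by_keep, rng)[0][0]
    return total / trials


if __name__ == "__main__":
    print("scale_by_keep=True :", mean_output(True))   # close to the eval-time value 1.0
    print("scale_by_keep=False:", mean_output(False))  # close to keep_prob = 0.8
```

Under these assumptions, leaving scale_by_keep=True keeps the residual branch on the same scale in training and evaluation. Setting it to False would make the training-time branch weaker on average than the one used at inference.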