Swin optimization results are inconsistent when qk_scale is not set to its default value

Open cyfwry opened this issue 3 years ago • 0 comments

Description

branch: v5.0
gpu: T4

Reproduced Steps

1. Clone and compile FasterTransformer.
2. cd examples/pytorch/swin
3. Modify QK_SCALE in Swin-Transformer-Quantization/SwinTransformer/configs/swin_tiny_patch4_window7_224.yaml, e.g. QK_SCALE=2.0.
4. Run sh run_test.sh
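
For context, here is a minimal sketch of why a non-default qk_scale could change the results. It assumes the optimized path always uses head_dim ** -0.5 while the reference model honors the configured value. That is a guess about the cause, not something confirmed from the FasterTransformer source. The function names, the head dimension of 32, and the toy query/key values are all made up for illustration; only the `qk_scale or head_dim ** -0.5` fallback follows the reference Swin attention code.

```python
import math


def resolve_scale(head_dim, qk_scale=None):
    # Fallback used by the reference Swin window attention:
    # use qk_scale when it is set, otherwise head_dim ** -0.5.
    return qk_scale or head_dim ** -0.5


def softmax(xs):
    # Numerically stable softmax over a plain list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q, keys, scale):
    # Scaled dot-product scores of one query against each key, then softmax.
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    return softmax(scores)


head_dim = 32  # illustrative value, not read from the config
q = [0.5] * head_dim                                  # toy query
keys = [[0.1] * head_dim, [0.2] * head_dim, [0.3] * head_dim]  # toy keys

# Reference path: honors the configured qk_scale (2.0, as in step 3).
reference = attention_weights(q, keys, resolve_scale(head_dim, 2.0))

# Hypothetical optimized path: ignores the config and uses the default scale.
hardcoded = attention_weights(q, keys, resolve_scale(head_dim))

max_diff = max(abs(a - b) for a, b in zip(reference, hardcoded))
print("reference:", reference)
print("hardcoded:", hardcoded)
print("max abs diff:", max_diff)
```

If the two paths used the same scale, max_diff would be zero; any mismatch in the scale changes the attention weights, and the difference carries into every later layer.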

Sep 20 '22 07:09 cyfwry