Swin-Transformer
Corner patches high similarity
Hi authors,
I computed the pairwise cosine similarity of the pretrained swin-tiny outputs without average pooling (a 7x7 grid of patches, each with a 1000-dim output).
The visualization below shows that the patches at the four corners are highly correlated. The input image is shown at the end.
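For anyone who wants to reproduce this kind of check, here is a minimal sketch of a pairwise cosine similarity over the patch outputs. It assumes the per-patch outputs are already available as a (49, 1000) array in row-major order over the 7x7 grid. The function name `patch_cosine_similarity` is hypothetical, and random values stand in for the real model output, so the numbers it produces say nothing about Swin itself.

```python
import numpy as np


def patch_cosine_similarity(tokens: np.ndarray) -> np.ndarray:
    """Return the (N, N) cosine similarity matrix for N patch vectors of shape (N, D)."""
    norms = np.linalg.norm(tokens, axis=1, keepdims=True)
    # Clip the norms so an all-zero patch vector does not cause a division by zero.
    unit = tokens / np.clip(norms, 1e-12, None)
    return unit @ unit.T


# Placeholder data (assumption): random values instead of real swin-tiny outputs.
rng = np.random.default_rng(0)
grid, dim = 7, 1000
tokens = rng.standard_normal((grid * grid, dim))

sim = patch_cosine_similarity(tokens)  # shape (49, 49)

# Flat indices of the four corner patches, assuming row-major ordering of the grid.
corners = [0, grid - 1, grid * (grid - 1), grid * grid - 1]
corner_sim = sim[np.ix_(corners, corners)]  # shape (4, 4)

# Similarity of the top-left patch to every patch, laid out on the 7x7 grid.
top_left_map = sim[0].reshape(grid, grid)
```

With real outputs in place of `tokens`, `corner_sim` gives the corner-to-corner similarities directly, and `top_left_map` can be plotted as a heatmap to compare against the visualization.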
Is this due to the 'relative position bias', or to some special processing around the corner patches?
Could you help explain this?
