ConvMAE
Visualizing ViT features
Hi, author.
How did you produce the attention maps shown in your results? Which part of the model should I visualize?
- The encoder (ViT)?
- The decoder (ViT)?
Given an input x, the pipeline is y = encoder(x), then decoder(y). Should I take the attention from the final ViT block of the decoder?
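
To make the question concrete, here is a minimal sketch of what I mean by an attention map. It is only my assumption of a common approach, not the method used in this repository: take the queries and keys of one transformer block, compute the softmax attention per head, average over heads and query tokens, and reshape the result to the patch grid. The names `attention_weights` and `attention_map` are hypothetical, the random `q_heads`/`k_heads` stand in for tensors that would be captured from the chosen block (encoder or decoder), and I assume there is no class token, so the token count equals `grid_h * grid_w`.

```python
import math
import random


def attention_weights(q, k):
    """softmax(q k^T / sqrt(d)) for one head; q and k are lists of N vectors of length d."""
    d = len(q[0])
    weights = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def attention_map(q_heads, k_heads, grid_h, grid_w):
    """Average the attention each patch receives over heads and query tokens,
    then reshape the flat token axis to a (grid_h, grid_w) patch grid."""
    n = grid_h * grid_w
    received = [0.0] * n
    for q, k in zip(q_heads, k_heads):
        if len(q) != n or len(k) != n:
            raise ValueError("token count must equal grid_h * grid_w (no class token assumed)")
        for row in attention_weights(q, k):
            for j, v in enumerate(row):
                received[j] += v
    scale = len(q_heads) * n
    received = [v / scale for v in received]
    return [received[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


# Hypothetical stand-in data: 2 heads, a 3x4 patch grid, head dimension 8.
random.seed(0)
heads, grid_h, grid_w, dim = 2, 3, 4, 8
n_tokens = grid_h * grid_w


def random_tokens():
    return [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]


q_heads = [random_tokens() for _ in range(heads)]
k_heads = [random_tokens() for _ in range(heads)]
amap = attention_map(q_heads, k_heads, grid_h, grid_w)
```

The resulting `amap` could then be upsampled to the image size and overlaid on the input. What I am unsure about is which block the queries and keys should come from.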