Geometric interpolation ViT
Hi,
In the interpolation for ViT, you have: https://github.com/microsoft/SimMIM/blob/bec329f54c2f84db16974035c3f010a88f1b3eb0/utils.py#L226
The keys it iterates over come from: https://github.com/microsoft/SimMIM/blob/bec329f54c2f84db16974035c3f010a88f1b3eb0/utils.py#L218
which includes the keys added here: https://github.com/microsoft/SimMIM/blob/bec329f54c2f84db16974035c3f010a88f1b3eb0/utils.py#L213-L215
However, these keys are not present in the model, so the code raises an error.
I want to use the ViT base models for downstream tasks.
Could you please tell me if I am missing something here?
Thanks.
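For reference, here is a minimal sketch of how one might check which checkpoint keys the model lacks before the interpolation runs. The function name split_checkpoint_keys is hypothetical (it is not part of SimMIM), and it only assumes that both sides expose their parameter names as strings.

```python
def split_checkpoint_keys(checkpoint_keys, model_keys):
    """Split checkpoint parameter names by whether the model also has them.

    Hypothetical helper for debugging, not part of SimMIM.
    Returns (matched, missing), both in checkpoint order.
    """
    model_key_set = set(model_keys)
    matched = [k for k in checkpoint_keys if k in model_key_set]
    missing = [k for k in checkpoint_keys if k not in model_key_set]
    return matched, missing


if __name__ == "__main__":
    # Toy key names for illustration only.
    ckpt = ["blocks.0.attn.relative_position_bias_table", "patch_embed.proj.weight"]
    model = ["patch_embed.proj.weight", "pos_embed"]
    matched, missing = split_checkpoint_keys(ckpt, model)
    print("matched:", matched)
    print("missing:", missing)
```

With a PyTorch model, the two arguments would be the keys of the loaded checkpoint dict and model.state_dict().keys(). Any name listed under missing is one that a direct lookup in the model's state dict would fail on.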
I figured out that the function was written with classification fine-tuning in mind. Hence you have:
use_abs_pos_emb=False, use_rel_pos_bias=True, use_shared_rel_pos_bias=False
as the settings in simmim_finetune__vit_base__img224__800ep.yaml.
However, it does not work for a dataset whose images are not square: https://github.com/microsoft/SimMIM/blob/bec329f54c2f84db16974035c3f010a88f1b3eb0/utils.py#L228-L229
Do you have any suggestions on how to handle this?
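One possible direction, as a rough sketch only: for a patch grid of (h, w), a relative position bias table covers (2h-1) x (2w-1) offsets, so the square pretrained grid could be resized to a rectangular one. The code below uses plain bilinear resizing in pure Python, not the geometric interpolation in utils.py, and the names rel_pos_grid_shape and resize_bilinear are hypothetical. It also ignores any extra class-token entries in the table, which would need to be split off before resizing and appended back afterwards.

```python
def rel_pos_grid_shape(patch_shape):
    """Number of distinct relative offsets along each axis for a patch grid."""
    h, w = patch_shape
    return (2 * h - 1, 2 * w - 1)


def resize_bilinear(grid, dst_shape):
    """Bilinearly resize a 2-D list of floats to dst_shape = (rows, cols).

    Corner values are preserved (corner-aligned sampling).
    """
    src_h, src_w = len(grid), len(grid[0])
    dst_h, dst_w = dst_shape
    out = []
    for i in range(dst_h):
        # Map the destination row index to a fractional source row.
        y = i * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = min(int(y), src_h - 1)
        y1 = min(y0 + 1, src_h - 1)
        fy = y - y0
        row = []
        for j in range(dst_w):
            # Map the destination column index to a fractional source column.
            x = j * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = min(int(x), src_w - 1)
            x1 = min(x0 + 1, src_w - 1)
            fx = x - x0
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


if __name__ == "__main__":
    # One attention head's bias values on a tiny square source grid.
    src = [[0.0, 1.0], [2.0, 3.0]]
    # Hypothetical non-square target patch grid of 2 x 3 patches.
    dst_shape = rel_pos_grid_shape((2, 3))
    for row in resize_bilinear(src, dst_shape):
        print(row)
```

In a real checkpoint each attention head would be resized separately, or in one batched call such as torch.nn.functional.interpolate. I have not tested either approach against SimMIM.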