Pretrained Network

Open 12sf12 opened this issue 3 years ago • 1 comments

Thanks for your outstanding work.

I ran into an issue when trying to load one of the pretrained ViT-Base checkpoints from this URL: 'https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth'

The checkpoint's state dict does not contain the key 'visual_encoder.pos_embed', so loading fails with an error. For instance, the following code does not run:

model_url = 'https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth'
model = blip_decoder(pretrained=model_url, image_size=224, vit='base')

Would it be possible to share a recent lightweight pretrained model? The issue occurs only with the checkpoint mentioned above.

Many Thanks.

Jul 28 '22 11:07 12sf12

Hi, my implementation of ViT is based on the timm codebase. You might want to try the pretrained weights from timm.

Aug 01 '22 08:08 LiJunnan1992
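
For readers hitting the same error, here is a minimal sketch of the key mismatch described in the question. It assumes the DeiT checkpoint nests its weights under a 'model' entry and names them without a 'visual_encoder.' prefix (for example 'pos_embed'), while the loader looks up 'visual_encoder.pos_embed'. The helpers unwrap and add_prefix are hypothetical names introduced only for illustration, not part of BLIP or timm, and a toy dictionary stands in for the real checkpoint so the snippet runs without downloading anything.

```python
def unwrap(checkpoint):
    """Return the nested weights if stored under 'model', else the dict itself."""
    return checkpoint.get("model", checkpoint)


def add_prefix(state_dict, prefix="visual_encoder."):
    """Return a copy of state_dict with `prefix` prepended to keys that lack it."""
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in state_dict.items()
    }


# Toy stand-in for a loaded checkpoint (assumption: the real file would be
# read with torch.load or torch.hub.load_state_dict_from_url instead).
toy_checkpoint = {"model": {"pos_embed": [0.0], "cls_token": [0.0]}}

weights = unwrap(toy_checkpoint)
print("visual_encoder.pos_embed" in weights)   # the key the loader expects is absent

remapped = add_prefix(weights)
print(sorted(remapped))
```

Even with renamed keys, a ViT-only checkpoint contains no weights for the text decoder, so whether the result loads into blip_decoder is untested here; a non-strict load would likely be needed.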