Pretrained Network

Open 12sf12 opened this issue 3 years ago • 1 comment

Hi

Thanks for your outstanding work.

I ran into an issue when trying to load one of the pretrained ViT-Base checkpoints from this URL: 'https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth'

The state dict of this checkpoint does not contain the key 'visual_encoder.pos_embed', so loading raises an error. For instance, the following code fails:

model_url = 'https://dl.fbaipublicfiles.com/deit/deit_base_patch16_224-b5f2ef4d.pth'
model = blip_decoder(pretrained=model_url, image_size=224, vit='base')

Would it be possible to share a recent lightweight pretrained model? The problem occurs only with the checkpoint mentioned above.

Many Thanks.

12sf12, Jul 28 '22 11:07

Hi, my implementation of ViT is based on the timm codebase. You might want to try the pretrained weights from timm.

LiJunnan1992, Aug 01 '22 08:08
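
For readers hitting the same error: one plausible cause is that a ViT-only checkpoint stores its weights under bare keys such as 'pos_embed', while the BLIP loader looks for the same tensors under a 'visual_encoder.' prefix. The sketch below illustrates that idea on a plain dictionary. The helper names `add_prefix` and `missing_keys` are hypothetical (they are not part of BLIP or timm), and the key layout of the toy checkpoint is an assumption that should be checked against the real file.

```python
from typing import Any, Dict, Iterable, List


def add_prefix(state_dict: Dict[str, Any], prefix: str = "visual_encoder.") -> Dict[str, Any]:
    """Return a copy of state_dict in which every key starts with prefix.

    Hypothetical helper: keys that already carry the prefix are left unchanged.
    """
    return {
        key if key.startswith(prefix) else prefix + key: value
        for key, value in state_dict.items()
    }


def missing_keys(state_dict: Dict[str, Any], required: Iterable[str]) -> List[str]:
    """List the required keys that are absent from state_dict."""
    return [key for key in required if key not in state_dict]


# Toy stand-in for a ViT-only checkpoint (assumed layout: no 'visual_encoder.' prefix).
vit_only = {"cls_token": 0, "pos_embed": 1, "patch_embed.proj.weight": 2}

# Before remapping, the key the issue mentions is missing.
print(missing_keys(vit_only, ["visual_encoder.pos_embed"]))

# After remapping, the same lookup succeeds.
remapped = add_prefix(vit_only)
print(missing_keys(remapped, ["visual_encoder.pos_embed"]))
```

With a real checkpoint, the dictionary would typically come from `torch.load(path, map_location="cpu")`, possibly nested under a key such as "model"; that nesting is also an assumption to verify. A remapped ViT-only checkpoint would still lack the text decoder weights, so those layers would keep their initial values and loading would likely need `strict=False`.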