LAVIS
HF hub [blip_text_model] num_attention_heads is 8? [blip_vision_model] eps is 1e-5?
- In this repo, https://github.com/salesforce/LAVIS/blob/main/lavis/configs/models/med_config.json sets num_attention_heads to 12, but https://huggingface.co/Salesforce/blip-image-captioning-large/blob/main/config.json sets num_attention_heads to 8 for [blip_text_model].
- Also, the eps for blip_vision_model should be 1e-6, not 1e-5. See https://github.com/salesforce/LAVIS/blob/2b6c6caf223e1a9a5139842d3191cad4166466b8/lavis/models/vit.py#L209
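For anyone who wants to check the same fields themselves, here is a minimal sketch of the comparison. The values are typed by hand from the two points above, not downloaded from either source. The helper name `diff_config` and the dotted key labels (`text.num_attention_heads`, `vision.layer_norm_eps`) are made up for illustration and may not match the real key names in either config file.

```python
def diff_config(expected, actual):
    """Return {key: (expected_value, actual_value)} for every key that differs."""
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


# Values as reported in this issue (hand-typed, hypothetical key labels).
lavis_values = {"text.num_attention_heads": 12, "vision.layer_norm_eps": 1e-6}
hf_values = {"text.num_attention_heads": 8, "vision.layer_norm_eps": 1e-5}

mismatches = diff_config(lavis_values, hf_values)
for key, (want, got) in mismatches.items():
    print(f"{key}: LAVIS={want} HF hub={got}")
```

To run this against the real files, the two dicts could be filled in from the parsed JSON configs instead of being hard-coded.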
@JunnYu thanks a lot for spotting this and raising the issue.
We'll coordinate with their team to update these configs on HF.
The implementation of BLIP models in LAVIS is better validated by us. You can rely on it for now.