
METER: A Multimodal End-to-end TransformER Framework

13 METER issues, sorted by recently updated

Hi author, thanks for your work. Can you provide an example script for the image captioning downstream task? Thanks.

Hi, I'm trying to make "run.py" work for pre-training, but I get a ValueError and an AttributeError, and I couldn't find a solution. Can you help me check it? Thank you...

Hello author, great work! I'm curious whether you have tried adding image-text contrastive learning to the pre-training tasks? Because in the ALBEF paper, they reported that the...

Hi! Thanks for your great work! I tried to pre-train the model on multiple nodes with multiple GPUs (8 × 8 GPUs, as ViLT did) and observed a performance drop when fine-tuning...

Hi, thanks for the code! I wonder if you plan to release the pre-trained weights of CLIP-ViT-224/32 (e.g., METER-CLIP32-RoBERTa (resolution: 224^2) pre-trained on GCC+SBU+COCO+VG)? It would be helpful for those...

Hi, I'm confused about which checkpoint to use for testing the downstream tasks. Should I evaluate the last checkpoint or the saved top-1 on the val...

I used pl.seed_everything to set the seed, ``` pl.seed_everything(_config["seed"], workers=True) ``` but I still got different results when I tested the Flickr30K image-to-text retrieval task on a model I trained myself. First:...
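For anyone hitting the same reproducibility issue: seeding every RNG is necessary but not always sufficient, because nondeterministic CUDA kernels or dataloader workers can still introduce variation. A minimal sketch of the idea, using Python's stdlib `random` as a stand-in for the framework RNGs (the `seed_everything` helper below is illustrative, not METER's code):

```python
import random

def seed_everything(seed: int) -> None:
    # Stand-in for pl.seed_everything: fix every RNG the run uses.
    random.seed(seed)
    # In a real PyTorch Lightning setup you would also seed numpy and
    # torch here, and additionally pass deterministic=True to pl.Trainer
    # (or set torch.backends.cudnn.deterministic = True) so that CUDA
    # kernels are repeatable across runs.

def run_once(seed: int) -> list:
    # Toy "experiment": five draws from the seeded RNG.
    seed_everything(seed)
    return [random.randint(0, 100) for _ in range(5)]

# With the same seed, the sequence is identical on every run.
a = run_once(42)
b = run_once(42)
assert a == b
```

If results still differ after this, the remaining nondeterminism usually comes from sources the seed does not control, e.g. non-deterministic GPU reductions or the order in which dataloader workers deliver batches.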

Thanks for the amazing repository. The code is really clean. If I understand correctly, the current implementation is a co-attention model, and the same goes for the pre-trained weights. I wanted to know if...

I have already run the IR/TR evaluation task on the COCO dataset with the example you provided on the command line. However, the returned values of IR R@1, R@5, R@10 and...

Hi authors, thanks for your great work! I am trying to reproduce the results but found IRTR testing too slow. It seems that it needs 38 hours...