Add OFA captioning
Hi,
I have integrated a new captioning model (OFA) into the repo. Link to the OFA model repo: https://github.com/OFA-Sys/OFA.git
Below is a quick comparison between OFA, BLIP and BLIP2:
BLIP: girl blowing a toy in the air as she stands near her
BLIP2 OPT_6.7B: a girl with red hair kissing a robot
OFA caption_huge_best: a girl with red hair holding a small robot to her face
OFA setting: num_beams: 3, max_len: 16, temperature: 0.5
Performance on my 1070 mobile:
- ~3GB for batch_size=1
- ~7.5GB for batch_size=20, max_data_loader_n_workers=4
- With the second setting (batch_size=20), it runs at 12s/it, which converts to 0.15s per image
Install requirement: The OFA code contains a custom version of fairseq, which needs to be built from source. It is best to install build-essential before running pip install.
Thank you for this! OFA captioning seems to work well!
However, including all of the OFA code in this repository would make future code management difficult. It would be nice if there were a simpler way to do this.
Also, our repository can process caption (text) files no matter which process created them. So I think it would be a good idea for you to create your own independent repository.
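To illustrate the idea of an independent captioning tool, here is a minimal sketch of a script that writes one caption text file next to each image. The function name `write_caption_files`, the `captioner` callable, the list of image suffixes, and the `.caption` extension are all assumptions for illustration; the OFA model call itself is not shown and would be plugged in as `captioner`.

```python
from pathlib import Path
from typing import Callable, List

# Assumed set of image suffixes; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def write_caption_files(
    image_dir: str,
    captioner: Callable[[Path], str],
    caption_extension: str = ".caption",  # assumption: use whatever extension your training setup reads
) -> List[Path]:
    """Write one caption file per image, next to the image itself.

    `captioner` is a hypothetical callable that takes an image path and
    returns a caption string (e.g. a wrapper around an OFA model).
    """
    written = []
    for image_path in sorted(Path(image_dir).iterdir()):
        # Skip anything that is not a recognized image file.
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption = captioner(image_path).strip()
        # Same base name as the image, with the caption extension instead.
        caption_path = image_path.with_suffix(caption_extension)
        caption_path.write_text(caption + "\n", encoding="utf-8")
        written.append(caption_path)
    return written
```

Because the captioning model is only reached through `captioner`, the custom fairseq build would stay in the separate repository, and this repository would only need to read the resulting text files.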