Fine-tuning ViLT for VQA on a custom dataset

Open seifmaged31 opened this issue 3 years ago • 0 comments

Hi, I have seen what you did in the tutorial on fine-tuning ViLT on the VQAv2 dataset, including how you constructed the dataset and the data loader. I am trying to apply the same approach to my own dataset, which has 6 Q&A pairs for each image. I am a beginner, so I am wondering what I should take into consideration when following the same steps.

Note: of course it is not the same corpus. Most of my answers are one word and some contain 2-3 words. Is it still valid to follow the same steps?
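To make the question concrete, here is a rough sketch of how I imagine the answer side of a custom dataset would be prepared, assuming VQA is treated as classification over a fixed answer vocabulary, so that a 2-3 word answer is simply one more class. The record layout, the file names, and the helper names (`normalize`, `build_vocab`, `encode_target`) are my own hypothetical choices, not something taken from the tutorial.

```python
# Hypothetical annotations: one record per (image, question, answer) triple.
# In a real dataset each image would have 6 such records.
annotations = [
    {"image": "img_001.jpg", "question": "What color is the car?", "answer": "red"},
    {"image": "img_001.jpg", "question": "Where is the car parked?", "answer": "parking lot"},
    {"image": "img_002.jpg", "question": "What color is the door?", "answer": "Red"},
]


def normalize(answer):
    """Lowercase and collapse whitespace so 'Red' and 'red ' map to one class."""
    return " ".join(answer.lower().split())


def build_vocab(records):
    """Map every distinct normalized answer to an integer class id."""
    answers = sorted({normalize(r["answer"]) for r in records})
    label2id = {answer: idx for idx, answer in enumerate(answers)}
    id2label = {idx: answer for answer, idx in label2id.items()}
    return label2id, id2label


def encode_target(answer, label2id):
    """Build a target vector over the vocabulary with 1.0 at the answer's class.

    Assumption: each question has a single ground-truth answer, so the
    vector is one-hot rather than a set of soft scores.
    """
    target = [0.0] * len(label2id)
    target[label2id[normalize(answer)]] = 1.0
    return target


label2id, id2label = build_vocab(annotations)
targets = [encode_target(r["answer"], label2id) for r in annotations]

print(label2id)
print(targets)
```

If this is the right idea, I assume `label2id` and `id2label` would then replace the VQAv2 answer mappings when the model is created, and the target vectors would be turned into tensors inside the dataset class.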

Thanks in advance, your help will be appreciated.

May 09 '22 23:05 seifmaged31