freeIsa
Hi there, I am aware that VirTex used image captioning as a pretraining task and not as the "final goal", but I was wondering whether one could go on fine-tuning...
Hi everyone, I am experimenting with different ways to sample a fixed number of frames from a video. At the same time, I would also need to get the timestamps...
Hello, I am currently exploring the task of image captioning and I'd like to understand whether/how pretrained LXMERT could be used for such a task. As a first attempt, I extracted...