Benno Krojer
Hi, thank you for your quick response! I found that there are three modes but the checkpoint "refcoco.pth" we are loading was finetuned with `models/model_retrieval.py`, according to the paper. And...
I replaced the forward() method in the visualization to use "text" and then "fusion", hoping that it would improve or at least change something. But the visualization outputs stayed...
Same here. The only solution is running "pkill wandb-service" from another terminal, which unfortunately also stops all your other active runs...
I am having the exact same issue right now! Will do the test you mentioned for confirmation
I just finished downloading and got more than 90% of the videos. I am not sure what is happening on your end.
Hi, I don't think I still have the dataset on my computer. I don't have the bandwidth right now to download it and then upload it for others, sorry!
Hi, thank you for your reply! It is still not clear to me how exactly Egocentric Navigation, for example, was created. How did you know when exactly to stop the...
Thank you! I will try that out
@Andy1621 In the paper you say: "“VideoChat2text” denotes the model receiving blank videos and excludes LoRA tuning, relying solely on the LLM’s capacity for responses". Does that mean you ran...
I ran both the Stage 2 and Stage 3 models in our pipeline with zeroed-out video input, but unfortunately the results look quite different from those in the paper. We were able...