stdKonjac
Recently I have been training a video model using torch DDP. Loading video frames with OpenCV works fine, but when I switch to decord, the training process will randomly...
Thanks for your great work. Following the official evaluation guide, I find it difficult to reproduce the results reported in your paper on VisDial. Could you please share the related...
Hi, I wonder what the conv_mode is for VILA1.5-40b in video inference. Additionally, I noticed that the \ token seems invalid in video inference. The eval code will automatically add...