trahman8
Thanks for the notebook. I will look into it.
Thanks for uploading the code.
Hi, I want to load pretrained weights for text-to-video generation and fine-tune the model on another dataset. I found there are two pretrained weights: (1) vqgan...
Thanks for the guidelines. Another question: which data loader did you use to train text-to-video generation, tats/coinrun/coinrun_dataset.py or tats/coinrun/coinrun_dataset_v2.py? When I tried to load coinrun_dataset.py, I got a "KeyError: 'ground'" error. Did...
When I fine-tune using the new data loader, the loss decreases and acc1 and acc5 are quite good during training. But during inference, the generated videos become worse.
1. We used one GPU, an A600. 2. Mugen is a large dataset that took longer to train, so we trained on it for 3 epochs. For the other two we...
We reused existing datasets and modified the ref texts in the data loader. See ldm/data/flintstones_data.py, ldm/data/mugen_data.py and ldm/data/proro_data.py for how the ref text is generated.
This file is the same as the original flintstones dataset.
I also tried to fine-tune the internvideo2-stage2 model, but performance decreased after fine-tuning, which should not happen. After fine-tuning I get the following results on the MSRVTT dataset: V2t_r1 → 41.6 T2v_r1 → 42.7...