trahman8 comments

Results: 9 comments by trahman8

Script for training condition on text

Thanks for the notebook and I will look into it.

Script for training condition on text

Thanks for uploading the code.

Script for training condition on text

Hi, I want to load pretrained weights for text-to-video generation and fine-tune the model on another dataset. I found there are two pretrained weights: (1) vqgan...

Script for training condition on text

Thanks for the guidelines. Another question: to train text-to-video generation, which data loader did you use, tats/coinrun/coinrun_dataset.py or tats/coinrun/coinrun_dataset_v2.py? When I tried to load coinrun_dataset.py, I got a "KeyError: 'ground'" error. Did...
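One way to narrow down a KeyError like the one above is to check which keys the loaded records actually contain before the loader indexes into them. The sketch below is only an illustration: it assumes the metadata is a list of per-sample dicts, and the helper names (summarize_keys, records_missing) and the sample records are hypothetical, not taken from the repository.

```python
from collections import Counter


def summarize_keys(records):
    """Count how often each key appears across a list of dict records."""
    counts = Counter()
    for rec in records:
        counts.update(rec.keys())
    return counts


def records_missing(records, key):
    """Return the indices of records that do not contain `key`."""
    return [i for i, rec in enumerate(records) if key not in rec]


# Hypothetical metadata; the real file layout may differ.
sample = [
    {"video": "a.mp4", "ground": 1},
    {"video": "b.mp4"},
]

key_counts = summarize_keys(sample)
missing = records_missing(sample, "ground")
print(dict(key_counts))
print(missing)
```

If the key is absent from every record, the data was probably written in a different format than the loader expects, which would point to using the other dataset file rather than patching the lookup.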

Script for training condition on text

When I fine-tune using the new data loader, the loss decreases and acc1 and acc5 are quite good during training. But during inference, the generated videos become worse.

How much VRAM is needed for this?

1. We used one GPU, an A600. 2. Mugen is a large dataset and took longer to train; we trained on it for 3 epochs. For the other two we...

Extended dataset

We reused existing datasets. In the dataloader, we modify the ref texts. See ldm/data/flintstones_data.py, ldm/data/mugen_data.py and ldm/data/proro_data.py for how the ref text is generated.

train-val-test_split.json missing

This file is the same as the one in the original Flintstones dataset.

Request for finetuned InternVideo2-1B results on video retrieval benchmarks

I also tried to fine-tune the internvideo2-stage2 model, but performance decreased after fine-tuning, which should not happen. After fine-tuning I get the following results on the MSRVTT dataset: V2t_r1 → 41.6, T2v_r1 → 42.7...
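For readers comparing numbers like these, here is a minimal sketch of how recall@1 is commonly computed from a video-text similarity matrix. It assumes each video has exactly one matching text at the same index, which may not match the evaluation protocol actually used; the function names and the toy matrix are illustrative only.

```python
def recall_at_k(sim, k=1):
    """Recall@k in percent.

    sim[i][j] is the similarity of query i to candidate j, and the
    correct candidate for query i is assumed to be candidate i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Indices of the k most similar candidates for this query.
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if i in top:
            hits += 1
    return 100.0 * hits / len(sim)


def transpose(sim):
    """Swap the query and candidate roles."""
    return [list(col) for col in zip(*sim)]


# Toy similarity matrix: rows are videos, columns are texts.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.7, 0.6],
]

v2t_r1 = recall_at_k(sim, k=1)             # video-to-text
t2v_r1 = recall_at_k(transpose(sim), k=1)  # text-to-video
print(round(v2t_r1, 1), round(t2v_r1, 1))
```

Running the same function on the similarity matrices from before and after fine-tuning, with an identical test split, helps separate a real regression from a difference in evaluation setup.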