yangzy_thu

Results 19 comments of yangzy_thu

@ShaoNeilz did you solve it? I have the same problem.

@xiaocaijizzz More detailed documentation is available here: https://sleepychord.github.io/cogdata/build/html/index.html. For example, you can use '--data_format TarDataset --data_files path_to_your_tar' or '--data_format ZipDataset --data_files path_to_your_zip' when creating the dataset. Images in zip are like...

We follow MAGVIT-v2 (https://arxiv.org/html/2310.05737v2). Using 4x+1 frames enables joint training on images and videos.
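As a rough illustration of why 4x+1 works, here is a minimal sketch. It assumes the first frame is encoded on its own and every following group of 4 frames maps to one latent frame, so a single image is just the x = 0 case. The function name `latent_frame_count` and the default factor of 4 are illustrative and not taken from the CogVideoX code.

```python
def latent_frame_count(num_frames: int, temporal_factor: int = 4) -> int:
    """Number of latent frames for a clip of `temporal_factor * x + 1` frames.

    Assumption: the first frame is encoded independently and each later
    group of `temporal_factor` frames is compressed into one latent frame.
    """
    if num_frames < 1 or (num_frames - 1) % temporal_factor != 0:
        raise ValueError(
            f"num_frames must have the form {temporal_factor}*x + 1, got {num_frames}"
        )
    return 1 + (num_frames - 1) // temporal_factor


# A single image (x = 0) and a 49-frame video (x = 12) go through the same rule.
image_latents = latent_frame_count(1)
video_latents = latent_frame_count(49)
```

Because an image is the degenerate one-frame clip, the same encoder can take both inputs without a separate image path.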

This part means that model parallelism in the transformer and context parallelism in the VAE use the same communication group. The transformer in the current open-source code does not support context parallelism.
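To make "the same communication group" concrete, here is a small sketch that only computes rank lists. The contiguous layout, the function name `build_rank_groups`, and the example sizes are assumptions for illustration, not the repository's actual initialization code.

```python
def build_rank_groups(world_size: int, group_size: int) -> list[list[int]]:
    """Split ranks 0..world_size-1 into contiguous groups of `group_size`."""
    if group_size < 1 or world_size % group_size != 0:
        raise ValueError("world_size must be a positive multiple of group_size")
    return [
        list(range(start, start + group_size))
        for start in range(0, world_size, group_size)
    ]


# Hypothetical setup: 8 GPUs, parallel size 2.
shared_groups = build_rank_groups(world_size=8, group_size=2)

# The same rank lists serve both purposes, so only one set of groups is created.
model_parallel_groups = shared_groups    # used by the transformer
context_parallel_groups = shared_groups  # used by the VAE
```

In a PyTorch job, each rank list would typically be passed to `torch.distributed.new_group(ranks=...)` once, and the resulting group handle reused by both components.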

CogVideoX uses the same noise level as Stable Video Diffusion, so dynamics won't be a problem.

```python
from openai import OpenAI

prefix = ''' **Objective**: **Give a highly descriptive video caption based on input image and user input. **. As an expert, delve deep into the image...
```

We used approximately 1/10 of the GPU hours for i2v fine-tuning, but similar performance can be achieved with fewer GPU hours.

Temporal compression by 8x can result in significant ghosting artifacts, which cannot be reflected in the evaluation metrics.

First, can you confirm that data loading isn't stalling the training?
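One simple way to check is to time how long each batch takes to arrive, separately from the training step. The sketch below works on any Python iterable; the helper name `time_data_loading` and the batch count are illustrative, and `loader` stands in for whatever data loader the training script uses.

```python
import time


def time_data_loading(loader, num_batches: int = 20) -> list[float]:
    """Return the wait time in seconds for each of the first batches of `loader`."""
    waits = []
    iterator = iter(loader)
    for _ in range(num_batches):
        start = time.perf_counter()
        try:
            next(iterator)  # blocks until the next batch is ready
        except StopIteration:
            break  # loader exhausted before num_batches was reached
        waits.append(time.perf_counter() - start)
    return waits


# Stand-in loader for demonstration; replace with the real data loader.
waits = time_data_loading(range(5), num_batches=10)
```

If the wait times are large compared with the time of one training step, or a call to `next` never returns, the data pipeline is the likely bottleneck.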