Zheng Guang Cong issues

Results: 8 issues by Zheng Guang Cong

how many images are used in training

![image](https://user-images.githubusercontent.com/28867789/175860738-e11d8734-d93e-4d56-80d7-2286e1f5b640.png) 1. How many images are used in training: 40k or the updated full 118k? As shown in the paper "High-Resolution Complex Scene Synthesis with Transformers", ![image](https://user-images.githubusercontent.com/28867789/175860915-cfc459c9-363e-4da5-aaca-af2e2861be5b.png) the number of training...

How to split coco-stuff val into 1024 val and 2048 test

Thanks for your impressive work! I have the following questions: 1. How do you split the coco-stuff val set into 1024 val and 2048 test images? 2. When calculating FID, do you generate 2048x5...

How to implement on stable diffusion?

A really exciting work! I wonder if it could be implemented in stable diffusion.

The difference between Loss( f(x0+t_{n+1}z) , f(x0+t_{n}z) ) and Loss( f(x0+t_{n+1}*z), x0 )?

1. In CT, would it be acceptable to use Loss( f(x0+t_{n+1}*z), x0 ) in place of Loss( f(x0+t_{n+1}*z), f(x0+t_{n}*z) )? 2. I would like to know if doing...
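To make the two objectives in this question concrete, here is a minimal sketch in plain Python. It is not taken from any repository: the helper names (`perturb`, `ct_loss`, `x0_regression_loss`, `identity_model`) are hypothetical, the distance is assumed to be squared L2, and `f_target` stands in for whatever target network is used for the second term (for example a frozen or EMA copy of `f`).

```python
def sq_l2(a, b):
    """Squared L2 distance between two equal-length vectors (assumed metric)."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def perturb(x0, z, t):
    """Return x0 + t * z, the noisy sample at noise level t."""
    return [x + t * n for x, n in zip(x0, z)]


def ct_loss(f, f_target, x0, z, t_n, t_next):
    """Loss( f(x0 + t_{n+1} z), f_target(x0 + t_n z) ): compares the model
    output at the higher noise level with a target output at the adjacent
    lower level, using the same noise z for both."""
    online = f(perturb(x0, z, t_next), t_next)
    target = f_target(perturb(x0, z, t_n), t_n)
    return sq_l2(online, target)


def x0_regression_loss(f, x0, z, t_next):
    """Loss( f(x0 + t_{n+1} z), x0 ): regresses directly onto the clean sample."""
    return sq_l2(f(perturb(x0, z, t_next), t_next), x0)


def identity_model(x, t):
    """Hypothetical toy model that returns its input unchanged."""
    return list(x)


x0, z = [1.0, 2.0], [1.0, -1.0]
print(ct_loss(identity_model, identity_model, x0, z, 0.5, 1.0))
print(x0_regression_loss(identity_model, x0, z, 1.0))
```

With the toy identity model, the first loss depends only on the gap between the two noise levels, while the second grows with the full noise level `t_next`. The sketch only shows how the two expressions differ in their targets; it does not say which one trains better.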

Layout2Image: Do you train on the deprecated coco-stuff 2017 segmentation challenge or the full coco-stuff 2017?

Thanks for your great work! SG2Im, Layout2Im, and LostGAN train on the deprecated coco-stuff with 24,972 training images. Do you train on the deprecated coco-stuff 2017 segmentation challenge or the...

loss m_mse of MDT-S-2 is much larger than mse and the visualization of MDT-S-2 with mask_ratio 0.3 does not work

Judging from the mse and m_mse losses, the mask branch does not seem to work in MDT-S-2. We also visualized the generated images and found that the generated image with...

Reproduction issues for Wan2.1-T2V-1.3B

1. Could you share the hyperparameters or shell script for sampling videos? 2. What are the precise settings of guidance_scale, flow_shift, num_inference_steps, sampler, checkpoint (diffusers or not), and negative prompt? I tried to...

Any plan to release training code for cogvideo-i2v?

1. LoRA finetuning 2. Full finetuning