Dao Minh Quan
Instead of using the pretrained vq-4 from the latent-diffusion repo, I used the KL-8 autoencoder pretrained for Stable Diffusion and managed to reproduce the result (you can see the DiT repo for it)...
Yeah, I used a fixed learning rate of 5e-5, and the UNet architecture is pretty much the same; the only difference here is that KL-8 downsamples from 256 to 32 (instead of vq-4 from...
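A minimal sketch of the KL-8 encoding step described above, using the `AutoencoderKL` class from the `diffusers` package (the checkpoint name and the 0.18215 latent scaling factor follow the DiT repo's convention; batch shapes and the random input are placeholder assumptions):

```python
import torch
from diffusers.models import AutoencoderKL

device = "cuda" if torch.cuda.is_available() else "cpu"
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema").to(device)
vae.eval()

@torch.no_grad()
def encode_to_latents(images: torch.Tensor) -> torch.Tensor:
    """images: (B, 3, 256, 256) in [-1, 1] -> latents: (B, 4, 32, 32)."""
    posterior = vae.encode(images).latent_dist
    # 0.18215 is the latent scaling factor used by Stable Diffusion / DiT.
    return posterior.sample() * 0.18215

# Example: a random batch just to check the 256 -> 32 downsampling.
x = torch.randn(2, 3, 256, 256, device=device)
z = encode_to_latents(x)
print(z.shape)  # torch.Size([2, 4, 32, 32])
```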
I met the same problem while running SER-FQA, but on Ubuntu. I'm not sure if it is the same on Windows. The way I solved that problem was to install...
If you are using the DiT architecture, you should install torch>=2.0; flash attention will let you train faster but will sacrifice some performance. Or you could run the encoder on the...
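For reference, a minimal sketch of the torch>=2.0 attention path: `F.scaled_dot_product_attention` dispatches to a flash-attention kernel when the hardware and dtypes allow it. The tensor shapes here are illustrative assumptions, and the `sdp_kernel` context is only used to force the flash backend for verification:

```python
import torch
import torch.nn.functional as F

# (batch, heads, sequence length, head dim), fp16 on CUDA as flash attention requires
q = torch.randn(4, 8, 256, 64, device="cuda", dtype=torch.float16)
k = torch.randn(4, 8, 256, 64, device="cuda", dtype=torch.float16)
v = torch.randn(4, 8, 256, 64, device="cuda", dtype=torch.float16)

# Force the flash kernel; this raises if flash attention is unavailable.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False,
                                    enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([4, 8, 256, 64])
```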
Hi, I'm understanding that you retrain our model and get 9.21. Is it correct ?
Please note that our stat file is computed using JPG images. If the generated images are PNGs, this leads to a very high FID.
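A minimal sketch of re-encoding generated PNGs as JPGs before computing FID, so the samples match the JPG-based reference statistics (the directory names and the quality setting are assumptions, not values from the repo):

```python
from pathlib import Path
from PIL import Image

src, dst = Path("samples_png"), Path("samples_jpg")
dst.mkdir(exist_ok=True)
for png in sorted(src.glob("*.png")):
    img = Image.open(png).convert("RGB")  # drop any alpha channel
    img.save(dst / (png.stem + ".jpg"), quality=95)
```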
I trained the model for 600 epochs and evaluated at epoch 475 for CelebHQ-256.
Yes, the model seems unstable after 500 epochs. In our paper, we use cosine learning-rate decay, and it depends on the total number of epochs. To be more stable, we suggest...
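A minimal sketch of cosine learning-rate decay with `T_max` tied to the total number of epochs, as described above; the optimizer choice, stand-in model, and epoch count are placeholder assumptions:

```python
import torch

model = torch.nn.Linear(10, 10)  # stand-in for the actual network
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
total_epochs = 600
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_epochs)

for epoch in range(total_epochs):
    # ... one epoch of training ...
    optimizer.step()
    scheduler.step()  # decays the LR along a cosine curve over total_epochs
```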
Yes, you should use DiT/train.py to revise my code. I found it easier and more compact to follow the DiT repo.
Yes, I think you ran it correctly; I wonder what environment you used to run the model. I found that the architecture is more stable with torch 1.x. I retrained...