ExplainingAI
Got it. Regarding simple downscaling leading to loss of detail, another thing you could try is, instead of passing a downsampled version, to pass the normal (same size as the original image) mask...
Thank you so much for your support :) A regular DCGAN discriminator maps inputs of, say, shape 256x256 to a single scalar output, so in scenarios where you need to feed...
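To make the 256x256-to-scalar mapping concrete, here is a rough sketch of the shape arithmetic only. The layer settings (4x4 kernels with stride 2 and padding 1, then a final 4x4 convolution with no padding) are assumed, typical DCGAN-style values and not taken from any specific repo; `conv_out` and `dcgan_style_sizes` are hypothetical helper names.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def dcgan_style_sizes(size):
    """Track the spatial size through an assumed DCGAN-style discriminator."""
    sizes = [size]
    # Assumed downsampling blocks: 4x4 kernel, stride 2, padding 1 (halves the size).
    while size > 4:
        size = conv_out(size, kernel=4, stride=2, padding=1)
        sizes.append(size)
    # Assumed final layer: 4x4 kernel, stride 1, no padding (collapses 4x4 to 1x1).
    size = conv_out(size, kernel=4, stride=1, padding=0)
    sizes.append(size)
    return sizes


sizes = dcgan_style_sizes(256)
print(sizes)
```

With these assumed settings the spatial size halves at every block until it reaches 4x4, and the last layer reduces it to a single value per image.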
Hello @sunly92 , Thank you :) You are right that the scaling factor is not present, but this scaling is only used by the authors for the VAE and not the VQVAE. You...
Hello @Aman-Khokhar18 , Can you let me know which specific part you are having trouble configuring? Ideally, just updating the config with the right resolution and channels in https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/config/celebhq.yaml#L3-L4 should...
Hello @mdtayebadnan , For unconditional generation one should see decent face-like outputs in 100 epochs with a batch size of 16. While training for 200 epochs should further improve results...
Hello @wendeyy , I think you can use the code that does mask-conditioned generation to perform super-resolution without requiring too many changes. So say you want to train a...
Yes, since this is a latent diffusion model, we would need to train a VAE (but a VAE on celebhq should not require more than 4-5 epochs to get a decent result). I...
For the autoencoder, there should be a folder `vqvae_autoencoder_samples` created (inside /mnt/StableDiffusion-PyTorch-main/celebhq) that would have reconstructions for images generated during training; just check the last image in that folder to see the...
Yes, you can try removing the attention to see if that gets rid of the error. What's the image size you are working with? Adding a few other things that...
Got it. Yeah, try with a batch size of 1 first. If it works, then you can train with gradient accumulation. But if that also fails, then you would have...
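For reference, a minimal sketch of the idea behind gradient accumulation, written in plain Python on a toy one-parameter model rather than with the actual training code. The model, the data, and the helper names `grad_w` and `accumulated_grad` are all made up for illustration. It shows that averaging gradients over micro-batches of size 1 gives the same gradient as one pass over the full batch.

```python
def grad_w(w, batch):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, batch, micro_batch_size=1):
    """Same gradient, built up one micro-batch at a time."""
    # Assumes len(batch) is divisible by micro_batch_size.
    num_micro = len(batch) // micro_batch_size
    total = 0.0
    for i in range(num_micro):
        micro = batch[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch gradient so the sum is an average, not a sum.
        total += grad_w(w, micro) / num_micro
    return total


# Toy data, hypothetical values.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

full = grad_w(w, data)                                  # one batch of 4
accum = accumulated_grad(w, data, micro_batch_size=1)   # four batches of 1
print(full, accum)
```

In a PyTorch loop the same effect usually comes from dividing the loss by the number of accumulation steps, calling `loss.backward()` on every micro-batch, and only calling `optimizer.step()` and `optimizer.zero_grad()` once every N micro-batches. Note that layers depending on batch statistics (e.g. batch norm) will still behave as if the batch size were 1.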