PyTorch-VAE

Training VQVAE does not converge.

Open henanjun opened this issue 3 years ago • 6 comments

  • This is the reconstruction result after 100 epochs of VQVAE training:
    [image: recons_VQVAE_Epoch_99]

henanjun avatar Nov 29 '22 03:11 henanjun

I found that in vq_vae.yaml, scheduler_gamma is set to 0.0. This parameter sets the multiplicative factor of torch.optim.lr_scheduler.ExponentialLR, so the learning rate becomes 0 after epoch 0. Do you think this is the reason?
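A minimal sketch of the effect described above (not the repo's actual training loop): ExponentialLR multiplies the learning rate by gamma after every scheduler step, so gamma=0.0 zeroes the learning rate from epoch 1 onward.

```python
import torch

# Dummy model/optimizer just to illustrate the scheduler behavior.
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)

# gamma=0.0 reproduces the vq_vae.yaml setting discussed here:
# lr_{epoch+1} = lr_epoch * gamma, so the LR collapses to 0 immediately.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.0)

print(scheduler.get_last_lr())  # [0.005] before any scheduler step
optimizer.step()
scheduler.step()                # end of "epoch 0"
print(scheduler.get_last_lr())  # [0.0] from here on, no learning happens
```

With a zero learning rate the weights never update after the first epoch, which would explain reconstructions that stop improving; note, though, that later comments attribute the divergence to the initial LR itself.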

blade-prayer avatar Dec 02 '22 08:12 blade-prayer

Changing "LR" (the learning rate) from 0.005 to 0.001 helps.
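For reference, this corresponds to a config change along these lines (the section and key names are assumed to match the repo's vq_vae.yaml; check your copy of the file):

```yaml
exp_params:
  LR: 0.001   # was 0.005; the lower learning rate avoids the loss blow-up
```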

imskull avatar Dec 09 '22 11:12 imskull

Changing "LR" (the learning rate) from 0.005 to 0.001 helps.

I met the same problem. The loss was always unreasonably high (~1.0e+6), which might have caused a gradient explosion. This fix helps, thanks a lot.

xjtupanda avatar Mar 18 '23 02:03 xjtupanda

Changing "LR" (the learning rate) from 0.005 to 0.001 helps.

In my training run the loss was even more unreasonable (up to 1.0e+26!) and just fluctuated like a pendulum. This fix really works, you are my god!!!

ohhh-yang avatar Mar 27 '24 09:03 ohhh-yang