flowplusplus

Bits per dim only reaches 3.38 on both train and valid

Open ClaireTU opened this issue 3 years ago • 1 comments

Thanks for your code! I ran into an issue when running it. I set all the parameters to match the paper and changed the number of epochs to 400 (also the same as the paper), but in the end I only got 3.38 bits/dim and could not reproduce the paper's result (3.09 bits/dim). Do you know how I can fix this, or how I should set the parameters? Sorry for bothering you, and I look forward to your reply!

ClaireTU, Dec 08 '22 02:12

Hi, I am also looking into this model. Where did you find the hyperparameters for the original paper? I went through the TensorFlow repo released by the author and noticed some differences between their hyperparameters and the ones used in this repo.

BTW, just to give a data point for reference: I ran the training with the settings as-is, except for batch size = 32 on 4 A100 GPUs, and got a bpd similar to yours. I am planning to switch to bs=64 and the TF repo hyperparameters and give it another try.
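
When comparing bpd numbers across repos, it may be worth checking that both use the same conversion from negative log-likelihood to bits per dim. Below is a minimal sketch of the usual conversion, not code taken from either repo. The function name `bits_per_dim` and the 32x32 RGB image size are assumptions for illustration.

```python
import math


def bits_per_dim(nll_nats: float, num_dims: int) -> float:
    """Convert a per-image negative log-likelihood (in nats) to bits per dimension."""
    # Divide by ln(2) to go from nats to bits, and by the number of dimensions.
    return nll_nats / (num_dims * math.log(2))


# Assumption: 32x32 RGB images, so 3 * 32 * 32 = 3072 dimensions per image.
num_dims = 3 * 32 * 32

# Round trip: the per-image NLL (in nats) that corresponds to 3.38 bits/dim.
nll_nats = 3.38 * num_dims * math.log(2)
print(round(bits_per_dim(nll_nats, num_dims), 2))  # 3.38
```

If the two code bases differ in how they handle this step (for example, log base or which dimensions are counted), the reported bpd values would not be directly comparable.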

TomPyonsuke, Dec 13 '22 08:12