Mark Goldstein
Mark Goldstein
BPD stands for bits per dimension, which is a type of log likelihood that is put into base-2 and standardized by the number of data dimensions.
Hi, The lines of code that you mention only happen during sampling, not training. During training, the label y is sometimes masked. During sampling, for CFG, people combine model(xt, t,...
I also reached out to the imagenet moderators to hear their input and will post any response here.
@Kim-Dongjun provided a good explanation and shared the location of the torrent that people use for the original data from pixel rnn. Here is Dongjun's explanation of the discrepancy (which...
Here is a summary. For imagenet 32x32, some papers use an "old" version and some use a "new" version. My understanding is: - the "new" one is the one current...