k-diffusion

Questions about soft-min-snr loss

Open. hmicrobe opened this issue 1 year ago • 1 comment

Really nice work! While reading through the paper, I had some questions about the proposed soft-min-snr loss. I would appreciate your feedback on them.

  1. For eq. (5) of the Hourglass Diffusion Transformers paper, it's mentioned that c_out^{-2}(\sigma) is incorporated. Based on the definition of c_out, however, eq. (5) should be
min(SNR, \gamma) * (\sigma_data^2 + \sigma^2) / (\sigma_data^2 * \sigma^2).
  2. In the implementation: https://github.com/crowsonkb/k-diffusion/blob/6ab5146d4a5ef63901326489f31f1d8e7dd36b48/k_diffusion/layers.py#L64-L65

The \gamma=4 or 5 proposed in the paper doesn't seem to be used. Am I missing anything here?
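The algebra behind point 1 can be checked numerically. This is a minimal sketch, assuming the usual EDM output scaling c_out(\sigma) = \sigma * \sigma_data / sqrt(\sigma^2 + \sigma_data^2); the function names are hypothetical and are not taken from k-diffusion.

```python
import math

def c_out(sigma, sigma_data):
    # Assumed EDM output scaling: sigma * sigma_data / sqrt(sigma^2 + sigma_data^2)
    return sigma * sigma_data / math.sqrt(sigma**2 + sigma_data**2)

def inv_c_out_sq(sigma, sigma_data):
    # Closed form from the question: (sigma_data^2 + sigma^2) / (sigma_data^2 * sigma^2)
    return (sigma_data**2 + sigma**2) / (sigma_data**2 * sigma**2)

# Compare c_out^{-2} against the closed form over a range of noise levels.
for sigma in (0.05, 0.5, 1.0, 10.0):
    assert math.isclose(c_out(sigma, 0.5) ** -2, inv_c_out_sq(sigma, 0.5))
```

Under that definition, c_out^{-2}(\sigma) does equal the factor written in point 1.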

hmicrobe, Feb 20 '24 07:02

In this code, gamma is not a free parameter: it is hardcoded to depend on sigma_data, chosen as gamma = sigma_data^-2. Combined with the preconditioner compensation, this leads to the formula you're seeing.
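One way to read this is sketched below. It rests on three assumptions that the thread does not spell out: the soft minimum takes the smooth form 1 / (sigma^2 + 1/gamma), standing in for min(1/sigma^2, gamma); the preconditioner compensation multiplies by c_out^2 because the loss is taken on the scaled model output; and c_out is the usual EDM output scaling. All function names are hypothetical, and the resulting closed form should be compared against the linked lines rather than taken as a quote of them.

```python
import math

def c_out(sigma, sigma_data):
    # Assumed EDM output scaling: sigma * sigma_data / sqrt(sigma^2 + sigma_data^2)
    return sigma * sigma_data / math.sqrt(sigma**2 + sigma_data**2)

def soft_min_snr(sigma, gamma):
    # Assumed smooth replacement for min(1/sigma^2, gamma)
    return 1.0 / (sigma**2 + 1.0 / gamma)

def compensated_weight(sigma, sigma_data):
    # gamma tied to sigma_data as described in the reply: gamma = sigma_data^-2
    gamma = sigma_data ** -2
    # Assumed compensation: multiply by c_out^2
    return soft_min_snr(sigma, gamma) * c_out(sigma, sigma_data) ** 2

def closed_form(sigma, sigma_data):
    # (sigma * sigma_data)^2 / (sigma^2 + sigma_data^2)^2
    return (sigma * sigma_data) ** 2 / (sigma**2 + sigma_data**2) ** 2

# Under the assumptions above, the two expressions agree at every noise level.
for sigma in (0.05, 0.5, 1.0, 10.0):
    assert math.isclose(compensated_weight(sigma, 0.5), closed_form(sigma, 0.5))
```

With these choices gamma never appears as a separate constant in the final expression, which would explain why no value such as 4 or 5 shows up in the code.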

stefan-baumann, Feb 20 '24 07:02