
SpeeD Timestep Sampling

Koratahiu opened this issue 2 months ago · 4 comments

This PR implements the timestep sampling method from: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training.

The paper claims a 3× pretraining speed-up at the same quality:

(image: broken link)

Usage

  • Set timestep distribution to SPEED

⚠️ Notes

  • Validated for pretraining only; the impact on finetuning is unknown, but the core concept may apply.

TODO

  • [ ] To be tested
  • [ ] Minimal change: the current approach modifies `_get_timestep_discrete` and requires betas/sigmas, which is not ideal.
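For readers unfamiliar with the method: SpeeD samples timesteps asymmetrically, drawing less often from the "convergence area" of the schedule where the noised sample barely changes between steps. The sketch below is a rough illustration of that idea (not this PR's implementation); the function names, the `suppress` factor, and the mean-change threshold are all assumptions, and the paper's exact criterion differs.

```python
import numpy as np

def speed_timestep_weights(betas: np.ndarray, suppress: float = 0.5) -> np.ndarray:
    """Hedged sketch of SpeeD-style asymmetric sampling weights.

    Timesteps where the diffusion process barely changes (roughly the
    paper's "convergence area") get reduced sampling probability.
    """
    alphas_cumprod = np.cumprod(1.0 - betas)
    # Per-step change in alpha_bar, used here as a proxy for the paper's
    # process-increment criterion (assumption, not the exact formula).
    change = np.abs(np.diff(alphas_cumprod, prepend=1.0))
    # Suppress timesteps whose change falls below the mean change.
    weights = np.where(change < change.mean(), suppress, 1.0)
    return weights / weights.sum()

def sample_timesteps(weights: np.ndarray, batch_size: int, rng=None) -> np.ndarray:
    # Draw discrete timesteps from the asymmetric distribution.
    if rng is None:
        rng = np.random.default_rng()
    return rng.choice(len(weights), size=batch_size, p=weights)

# Example with a linear beta schedule of 1000 steps:
betas = np.linspace(1e-4, 0.02, 1000)
w = speed_timestep_weights(betas)
t = sample_timesteps(w, batch_size=8)
```

This keeps the change confined to the sampling distribution itself, which is why the TODO about not touching `_get_timestep_discrete` directly is mostly a code-organization concern rather than an algorithmic one.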

Koratahiu avatar Nov 14 '25 05:11 Koratahiu

Should I expect better quality for the same number of steps when fine-tuning? Or, what should I pay attention to in order to test it?

miasik avatar Nov 14 '25 07:11 miasik

@Koratahiu The image you included is 404.

O-J1 avatar Nov 14 '25 08:11 O-J1

I usually use "debiased estimation" as the loss weight function. Should I set it to "constant" when using SpeeD?

miasik avatar Nov 14 '25 08:11 miasik

> Should I expect better quality for the same number of steps when fine-tuning? Or, what should I pay attention to in order to test it?

Yeah, if it works, then it should converge faster in the same number of steps.

> I usually use "debiased estimation" as the loss weight function. Should I set it to "constant" when using SpeeD?

The paper mentions that it’s compatible with loss weight functions (e.g., p2, min-SNR, debiased estimation, etc.), and in their official repo, they set the loss weight function to p2 by default.
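To make the compatibility point concrete: SpeeD only changes *which* timesteps get drawn, so any loss weight function is applied on top of the sampled timesteps as usual. The sketch below illustrates this with Min-SNR-γ purely as an example (not debiased estimation's formula, and not OneTrainer's actual code); the function name and `gamma` default are assumptions.

```python
import numpy as np

def min_snr_weight(t, alphas_cumprod, gamma=5.0):
    # SNR(t) = alpha_bar_t / (1 - alpha_bar_t) for a DDPM-style schedule.
    snr = alphas_cumprod[t] / (1.0 - alphas_cumprod[t])
    # Clip high-SNR (low-noise) timesteps so they don't dominate the loss.
    return np.minimum(snr, gamma) / snr

betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)
t = np.array([0, 500, 999])  # e.g. timesteps drawn by SpeeD sampling
w = min_snr_weight(t, alphas_cumprod)
# The per-sample MSE loss would then be multiplied by w before averaging,
# regardless of how t was sampled.
```

So "debiased estimation" should compose with SpeeD the same way; there is no obvious need to switch to "constant".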

Koratahiu avatar Nov 14 '25 11:11 Koratahiu