Asynchronous multi-GPU, multi-model training with a shared replay buffer
Let's say we have multiple diffusion models, as in the Cascaded Diffusion Models paper.
Is there an easy way to set up training so that each conditional model is trained simultaneously on a different GPU? What about with a shared replay buffer that each conditional model can access?
Hi @richardrl! For now, the easiest way to train a multi-stage pipeline is to run a separate training script for each stage (e.g. train a super-resolution diffusion model just on a dataset of LR-HR image pairs), similar to GLIDE, Imagen, etc. The augmentations mentioned in the Cascaded Diffusion paper should give a nice boost too.
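If you do want the stages to run concurrently and exchange samples, one option (not something the library provides out of the box, just a sketch) is to run one process per stage and share a capacity-bounded replay buffer through a `multiprocessing.Manager`. All names here (`push`, `sample`, `stage_worker`) are hypothetical; the worker body is a stand-in for a real training loop pinned to its own GPU.

```python
import multiprocessing as mp
import random


def push(buffer, lock, capacity, item):
    # Append an item, evicting the oldest entry once the buffer is full.
    with lock:
        if len(buffer) >= capacity:
            buffer.pop(0)
        buffer.append(item)


def sample(buffer, lock, k):
    # Take a consistent snapshot under the lock, then sample from it.
    with lock:
        snapshot = list(buffer)
    return random.sample(snapshot, min(k, len(snapshot)))


def stage_worker(stage_id, buffer, lock, capacity, n_steps):
    # Stand-in for one conditional model's training loop (in practice,
    # each process would pin its model to a different GPU).
    for step in range(n_steps):
        push(buffer, lock, capacity, (stage_id, step))
        _batch = sample(buffer, lock, k=4)  # train on _batch here


if __name__ == "__main__":
    manager = mp.Manager()
    buffer, lock, capacity = manager.list(), manager.Lock(), 100
    procs = [
        mp.Process(target=stage_worker, args=(i, buffer, lock, capacity, 50))
        for i in range(2)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(len(buffer))  # never exceeds `capacity`
```

A `Manager` proxy is slow for large tensors; for real image batches you'd likely want shared-memory tensors (`torch.multiprocessing`) or an on-disk queue instead, but the locking pattern stays the same.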
Closing now @richardrl - please ping us if your question hasn't been answered well enough in your opinion.