
Add Ascend NPU support for SDXL and fix bugs

Open HelloWorldBeginner opened this issue 1 year ago • 1 comments

What does this PR do?

  1. Adds flash attention support for Ascend NPU, similar to #7816.
  2. Fixes a bug related to saving the model when using deepspeed, also similar to #7816.
  3. Fixes an out-of-memory bug when handling data.

In the following code, data processing occurs on a single GPU or NPU. Assuming you have eight GPUs and use 4 gradient accumulation steps, the batch size used for data processing is multiplied by 8 * 4 = 32. Processing batches of that size on a single device can easily lead to out-of-memory errors.

train_dataset_with_vae = train_dataset.map(
    compute_vae_encodings_fn,
    batched=True,
    batch_size=args.train_batch_size * accelerator.num_processes * args.gradient_accumulation_steps,
    new_fingerprint=new_fingerprint_for_vae,
)

I encountered this issue on an 8xA100 machine. As shown in the figure, the memory usage on the first GPU spikes significantly during data processing. Actual training does not consume nearly as much memory, especially when using distributed frameworks like DeepSpeed.
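To make the arithmetic concrete, here is a minimal sketch comparing the batch size in the snippet above with a per-device batch size. TrainArgs, the two helper functions, and the value train_batch_size=16 are hypothetical and used only for illustration. Passing just args.train_batch_size to map is one possible remedy under that assumption, not necessarily the exact change made in this PR.

```python
from dataclasses import dataclass


@dataclass
class TrainArgs:
    # Hypothetical stand-in for the training script's parsed arguments.
    train_batch_size: int
    gradient_accumulation_steps: int


def global_map_batch_size(args: TrainArgs, num_processes: int) -> int:
    """Batch size as computed in the snippet above (global effective batch)."""
    return args.train_batch_size * num_processes * args.gradient_accumulation_steps


def per_device_map_batch_size(args: TrainArgs) -> int:
    """Assumed alternative: encode one per-device batch at a time."""
    return args.train_batch_size


# Illustrative values: 8 processes and 4 accumulation steps, as in the description.
args = TrainArgs(train_batch_size=16, gradient_accumulation_steps=4)

print(global_map_batch_size(args, num_processes=8))  # 16 * 8 * 4 = 512
print(per_device_map_batch_size(args))               # 16
```

With these assumed values, each call to compute_vae_encodings_fn would receive 512 samples instead of 16, which is consistent with the memory spike described above.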

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline?
  • [ ] Did you read our philosophy doc (important for complex PRs)?
  • [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

  • Training examples: @sayakpaul

HelloWorldBeginner, May 11 '24 03:05

Thanks very much for your PR. Could we maybe split the PR into related sections please?

  1. NPU related things
  2. DeepSpeed related things

Please LMK once done and we will take it from there.

sayakpaul, May 11 '24 07:05

Hi @sayakpaul, I split the PR into two parts: https://github.com/huggingface/diffusers/pull/7916 and https://github.com/huggingface/diffusers/pull/7917

HelloWorldBeginner, May 11 '24 07:05

Thanks much. Let me take a look!

sayakpaul, May 11 '24 07:05