
'StableDiffusionXLPipeline' object has no attribute 'named_parameters' when using accelerate.load_checkpoint_and_dispatch

thiagocrepaldi opened this issue 1 year ago • 4 comments

Describe the bug

How can I make the Stable Diffusion XL model work with the init_empty_weights API? I followed https://huggingface.co/docs/accelerate/v0.11.0/en/big_modeling but still got an unexpected error:

Reproduction

This is what I got so far:

import torch
from diffusers import DiffusionPipeline
from accelerate import init_empty_weights
from accelerate import load_checkpoint_and_dispatch


model_id = "stabilityai/stable-diffusion-xl-base-1.0"
with init_empty_weights():
    model = DiffusionPipeline.from_pretrained(model_id, low_cpu_mem_usage=False, use_safetensors=True)
model = load_checkpoint_and_dispatch(
    model, checkpoint=model_id, device_map="auto"
)

Logs

but it errors out with


Traceback (most recent call last):
  File "/opt/pytorch/test_sdxl_export_hf.py", line 10, in <module>
    model = load_checkpoint_and_dispatch(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/big_modeling.py", line 567, in load_checkpoint_and_dispatch
    max_memory = get_balanced_memory(
                 ^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 946, in get_balanced_memory
    module_sizes = compute_module_sizes(model, dtype=dtype, special_dtypes=special_dtypes)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 701, in compute_module_sizes
    for name, tensor in named_module_tensors(model, recurse=True):
  File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 475, in named_module_tensors
    for named_parameter in module.named_parameters(recurse=recurse):
                           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/diffusers/src/diffusers/configuration_utils.py", line 142, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'StableDiffusionXLPipeline' object has no attribute 'named_parameters'

System Info

  • diffusers version: 0.27.0.dev0
  • Platform: Linux-6.5.0-21-generic-x86_64-with-glibc2.31
  • Python version: 3.11.5
  • PyTorch version (GPU?): 2.3.0a0+gitb4b1480 (True)
  • Huggingface_hub version: 0.20.3
  • Transformers version: 4.38.1
  • Accelerate version: 0.27.2
  • xFormers version: not installed
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help?

No response

thiagocrepaldi avatar Mar 04 '24 16:03 thiagocrepaldi

DiffusionPipeline is not an nn.Module, and it won't become one, as it's only ever meant to be used for inference.

There are, however, components of a DiffusionPipeline that are of type nn.Module.

When you instantiate any DiffusionPipeline using from_pretrained(), we default to low_cpu_mem_usage=True. This means all the pipeline components that are of type nn.Module are first initialized under the init_empty_weights context (skipping the cost of random initialization), and then the pre-trained weights are attached directly to the respective model parameters.

For more details, you should refer to: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/pipeline_utils.py.

sayakpaul avatar Mar 05 '24 11:03 sayakpaul

Thanks @sayakpaul

Does this mean the load_checkpoint_and_dispatch API is not compatible with HF models loaded using from_pretrained with low_cpu_mem_usage=False in general, or is this an SDXL limitation?

Looking at Handling big models, I saw an example using the .from_pretrained API for another model, as shown below.

from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

checkpoint = "EleutherAI/gpt-j-6B"
config = AutoConfig.from_pretrained(checkpoint)

with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

Is there a way we can load SDXL with the big model APIs (i.e. init_empty_weights, load_checkpoint_and_dispatch, etc.) and not use low_cpu_mem_usage=False?

For additional context, you can refer to https://github.com/huggingface/accelerate/issues/2494

thiagocrepaldi avatar Mar 05 '24 18:03 thiagocrepaldi

Does this mean the load_checkpoint_and_dispatch API is not compatible with HF models loaded using from_pretrained with low_cpu_mem_usage=False in general, or is this an SDXL limitation?

I think there's a misunderstanding. I already mentioned that DiffusionPipeline is not an nn.Module, and that is the reason why it's not supported. Also, I find it slightly misleading to refer to this as "HF's model": models from the transformers library also fall under that umbrella, and unlike DiffusionPipelines, they are instances of nn.Module.

Is there a way we can load SDXL with the big model APIs (i.e. init_empty_weights, load_checkpoint_and_dispatch, etc.) and not use low_cpu_mem_usage=False?

You will have to rejig the pipeline code yourself for that:

  • Load the individual components of a pipeline under the init_empty_weights context.
  • Then fetch the state dict of the individual components and leverage load_checkpoint_and_dispatch.

Internally, pipeline_utils.py already does this.

sayakpaul avatar Mar 05 '24 18:03 sayakpaul

agree with @sayakpaul here

I think this is already using init_empty_weights if you do this:

from diffusers import DiffusionPipeline

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
model = DiffusionPipeline.from_pretrained(model_id)

yiyixuxu avatar Mar 09 '24 18:03 yiyixuxu

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 04 '24 15:04 github-actions[bot]