'StableDiffusionXLPipeline' object has no attribute 'named_parameters' when using accelerate.load_checkpoint_and_dispatch
Describe the bug
How can I make the Stable Diffusion XL model work with the init_empty_weights API? I've followed https://huggingface.co/docs/accelerate/v0.11.0/en/big_modeling but still get an unexpected error:
Reproduction
This is what I got so far:
import torch
from diffusers import DiffusionPipeline
from accelerate import init_empty_weights
from accelerate import load_checkpoint_and_dispatch
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
with init_empty_weights():
    model = DiffusionPipeline.from_pretrained(model_id, low_cpu_mem_usage=False, use_safetensors=True)
model = load_checkpoint_and_dispatch(
    model, checkpoint=model_id, device_map="auto"
)
Logs
but it errors out with
Traceback (most recent call last):
File "/opt/pytorch/test_sdxl_export_hf.py", line 10, in <module>
model = load_checkpoint_and_dispatch(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/big_modeling.py", line 567, in load_checkpoint_and_dispatch
max_memory = get_balanced_memory(
^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 946, in get_balanced_memory
module_sizes = compute_module_sizes(model, dtype=dtype, special_dtypes=special_dtypes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 701, in compute_module_sizes
for name, tensor in named_module_tensors(model, recurse=True):
File "/opt/conda/envs/ptca/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 475, in named_module_tensors
for named_parameter in module.named_parameters(recurse=recurse):
^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/diffusers/src/diffusers/configuration_utils.py", line 142, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'StableDiffusionXLPipeline' object has no attribute 'named_parameters'
System Info
- diffusers version: 0.27.0.dev0
- Platform: Linux-6.5.0-21-generic-x86_64-with-glibc2.31
- Python version: 3.11.5
- PyTorch version (GPU?): 2.3.0a0+gitb4b1480 (True)
- Huggingface_hub version: 0.20.3
- Transformers version: 4.38.1
- Accelerate version: 0.27.2
- xFormers version: not installed
- Using GPU in script?: no
- Using distributed or parallel set-up in script?: no
Who can help?
No response
DiffusionPipeline is not an nn.Module, and it won't ever be one, since it is only meant to be used during inference.
There are, however, components of a DiffusionPipeline that are of type nn.Module.
When you instantiate any DiffusionPipeline using from_pretrained(), we default to low_cpu_mem_usage=True. This means all the pipeline components that are of type nn.Module are first initialized under the init_empty_weights context (skipping the cost of random initialization), and then the pre-trained weights are directly attached to the respective model parameters.
For more details, you should refer to: https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/pipeline_utils.py.
Thanks @sayakpaul
Does this mean the load_checkpoint_and_dispatch API is incompatible in general with HF models loaded via from_pretrained with low_cpu_mem_usage=False, or is this an SDXL limitation?
Looking at the Handling big models guide, I saw an example for another model that loads the config with from_pretrained and instantiates the model with from_config:
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM
checkpoint = "EleutherAI/gpt-j-6B"
config = AutoConfig.from_pretrained(checkpoint)
with init_empty_weights():
model = AutoModelForCausalLM.from_config(config)
Is there a way we can load SDXL with the big model APIs (i.e. init_empty_weights, load_checkpoint_and_dispatch, etc.) without setting low_cpu_mem_usage=False?
For additional context, you can refer to https://github.com/huggingface/accelerate/issues/2494
Does this mean the load_checkpoint_and_dispatch API is incompatible in general with HF models loaded via from_pretrained with low_cpu_mem_usage=False, or is this an SDXL limitation?
I think there's a misunderstanding. I already mentioned that DiffusionPipeline is not an nn.Module, and that is the reason it's not supported. Also, I find it slightly misleading to refer to this as "HF's model": models from the transformers library also fall into that category, and unlike DiffusionPipelines, they are instances of nn.Module.
Is there a way we can load SDXL with the big model APIs (i.e. init_empty_weights, load_checkpoint_and_dispatch, etc.) without setting low_cpu_mem_usage=False?
You will have to rejig the pipeline code yourself for that:
- Load the individual components of a pipeline under the init_empty_weights context.
- Then fetch the state dict of the individual components and leverage load_checkpoint_and_dispatch.
Internally within pipeline_utils.py we already do that.
agree with @sayakpaul here
I think this is already using init_empty_weights if you do this:
from diffusers import DiffusionPipeline
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
model = DiffusionPipeline.from_pretrained(model_id)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.