diffusers Unsuccessful cross-attention weight loading in Custom Diffusion

Describe the bug

If PEFT is installed in the environment, Custom Diffusion does not load the cross-attention parameters, which leads to poor generation results. Given how long this takes to troubleshoot, the documentation should state that the current implementation is incompatible with PEFT. The code that causes the problem is:

        if not USE_PEFT_BACKEND:
            if _pipeline is not None:
                for _, component in _pipeline.components.items():
                    if isinstance(component, nn.Module) and hasattr(component, "_hf_hook"):
                        is_model_cpu_offload = isinstance(getattr(component, "_hf_hook"), CpuOffload)
                        is_sequential_cpu_offload = isinstance(getattr(component, "_hf_hook"), AlignDevicesHook)

                        logger.info(
                            "Accelerate hooks detected. Since you have called `load_lora_weights()`, the previous hooks will be first removed. Then the LoRA parameters will be loaded and the hooks will be applied again."
                        )
                        remove_hook_from_module(component, recurse=is_sequential_cpu_offload)

            # only custom diffusion needs to set attn processors
            if is_custom_diffusion:
                self.set_attn_processor(attn_processors)

            # set lora layers
            for target_module, lora_layer in lora_layers_list:
                target_module.set_lora_layer(lora_layer)

            self.to(dtype=self.dtype, device=self.device)

            # Offload back.
            if is_model_cpu_offload:
                _pipeline.enable_model_cpu_offload()
            elif is_sequential_cpu_offload:
                _pipeline.enable_sequential_cpu_offload()
            # Unsafe code />
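
In the excerpt above, the `set_attn_processor` call sits inside the `if not USE_PEFT_BACKEND:` branch, so it is skipped whenever the PEFT backend is active. The toy sketch below only reproduces that control flow. `ToyUNet`, its `load_attn_procs` signature, and the `use_peft_backend` argument are hypothetical stand-ins for illustration, not diffusers code.

```python
class ToyUNet:
    """Hypothetical stand-in for the UNet loader; not the diffusers class."""

    def __init__(self):
        self.attn_processors = {}

    def set_attn_processor(self, processors):
        self.attn_processors = dict(processors)

    def load_attn_procs(self, processors, is_custom_diffusion, use_peft_backend):
        # Mirrors the quoted excerpt: the custom diffusion step is nested
        # under the backend check, so it never runs when the flag is True.
        if not use_peft_backend:
            if is_custom_diffusion:
                self.set_attn_processor(processors)


# Pretend these are the trained cross-attention processors.
loaded = {"down_blocks.0.attn2.processor": "custom-diffusion-weights"}

without_peft = ToyUNet()
without_peft.load_attn_procs(loaded, is_custom_diffusion=True, use_peft_backend=False)

with_peft = ToyUNet()
with_peft.load_attn_procs(loaded, is_custom_diffusion=True, use_peft_backend=True)

print(len(without_peft.attn_processors))  # 1: processors were applied
print(len(with_peft.attn_processors))     # 0: processors were silently skipped
```

No error is raised in the second case, which would explain why the symptom is only a poor image rather than a failure at load time.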

Reproduction

pip install peft 

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export OUTPUT_DIR="./ckpt/cat"
export INSTANCE_DIR="./data/cat"

accelerate launch train_custom_diffusion.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="photo of a <new1> cat"  \
  --resolution=512  \
  --train_batch_size=2  \
  --learning_rate=1e-5  \
  --lr_warmup_steps=0 \
  --max_train_steps=1000 \
  --scale_lr \
  --hflip  \
  --modifier_token "<new1>" \
  --no_safe_serialization \
  --validation_steps=50 \
  --validation_prompt="<new1> cat sitting in a bucket" \
  --report_to="wandb" 


import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16,
).to("cuda")
pipeline.unet.load_attn_procs("./ckpt/cat", weight_name="pytorch_custom_diffusion_weights.bin")
pipeline.load_textual_inversion("./ckpt/cat", weight_name="<new1>.bin")

image = pipeline(
    "<new1> cat sitting in a bucket",
    num_inference_steps=100,
    guidance_scale=6.0,
    eta=1.0,
).images[0]
image.save("cat.png")
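
Before judging the image, it may help to check which attention processors the UNet actually holds after `load_attn_procs`. The sketch below is a small helper that counts processors by class name. `summarize_processors` is a hypothetical name, and the two classes in the demo are empty stand-ins so the snippet runs without diffusers; with the real pipeline one would pass `pipeline.unet.attn_processors` instead.

```python
from collections import Counter


def summarize_processors(attn_processors):
    """Count attention processors by class name (hypothetical helper)."""
    return Counter(type(proc).__name__ for proc in attn_processors.values())


# Empty stand-in classes for the demo only; they are not the diffusers ones.
class DefaultProcessorStandIn:
    pass


class CustomDiffusionProcessorStandIn:
    pass


demo = {
    "down_blocks.0.attn1.processor": DefaultProcessorStandIn(),
    "down_blocks.0.attn2.processor": CustomDiffusionProcessorStandIn(),
    "mid_block.attn2.processor": CustomDiffusionProcessorStandIn(),
}
summary = summarize_processors(demo)
print(summary["CustomDiffusionProcessorStandIn"])  # 2

# With the pipeline from the snippet above (assumes it is already built):
# print(summarize_processors(pipeline.unet.attn_processors))
```

If no class name containing "CustomDiffusion" shows up for the real pipeline, the trained weights were probably not applied.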

Logs

No response

System Info

torch 2.2.1
diffusers 0.27.0.dev0
peft 0.9.0
transformers 4.38.2

Who can help?

@sayakpaul @yiyixuxu @DN6

Mar 09 '24 12:03 Rbrq03

Thanks for such a detailed issue. Would you like to open a PR to fix it?

Mar 09 '24 13:03 sayakpaul

I opened a PR to fix this problem. In my local test, the cross-attention weights load successfully.

Mar 11 '24 06:03 Rbrq03

Same problem

Mar 11 '24 07:03 daeunni

@daeunni you can try one of these methods:

1. Uninstall PEFT if you don't use LoRA.
2. Modify the code in diffusers/loaders/unet.py as my PR does.

I hope this solves your problem.
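
For the first option, a quick pre-flight check can tell whether PEFT is present before loading the weights. This is a minimal sketch using only the standard library. `peft_installed` and `warn_if_peft_present` are hypothetical helper names, and the warning text is an assumption based on the behaviour reported in this thread.

```python
import importlib.util
import warnings


def peft_installed() -> bool:
    """Return True if the `peft` package can be found in this environment."""
    return importlib.util.find_spec("peft") is not None


def warn_if_peft_present() -> bool:
    """Warn before loading Custom Diffusion weights if PEFT is installed."""
    found = peft_installed()
    if found:
        warnings.warn(
            "peft is installed; on affected diffusers versions the Custom "
            "Diffusion attention processors may not be applied."
        )
    return found


if __name__ == "__main__":
    print("peft installed:", warn_if_peft_present())
```

Calling `warn_if_peft_present()` right before `pipeline.unet.load_attn_procs(...)` makes the condition visible instead of leaving it to show up as a degraded image.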

Mar 11 '24 08:03 Rbrq03

It works! u r the best :-) This PR needs to be accepted.

Mar 12 '24 04:03 daeunni

I think uninstalling PEFT is necessary for that step (at least in my environment).

Mar 12 '24 04:03 daeunni

Glad to hear that :)

Mar 12 '24 06:03 Rbrq03

@Rbrq03 Thanks! But friend, have you ever used attend-and-excite with Custom Diffusion? I load the trained pytorch_custom_diffusion_weights.bin using StableDiffusionAttendAndExcitePipeline, but the loading fails. I am wondering whether it has the same cause as this issue.

This is my code with attend-and-excite.

import torch
from diffusers import DDPMScheduler, StableDiffusionAttendAndExcitePipeline

pipeline = StableDiffusionAttendAndExcitePipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16,
).to("cuda")

cur_ckpt = '/d1/daeun/diffusers/examples/custom_diffusion/exp/cyw_step300'

# Load the trained Custom Diffusion cross-attention weights.
pipeline.unet.load_attn_procs(
    cur_ckpt,  # + '/checkpoint-300',
    # weight_name='model.safetensors',
    weight_name="pytorch_custom_diffusion_weights.bin",
)

# Load the learned embedding for each modifier token.
for token in ("<new1>", "<new2>", "<new3>", "<new4>", "<new5>"):
    pipeline.load_textual_inversion(f"{cur_ckpt}/{token}.bin", weight_name=f"{token}.bin")

pipeline.scheduler = DDPMScheduler.from_config(pipeline.scheduler.config)
generator = torch.Generator(device="cuda")
prompt = "a <new3> cat playing with a <new2> duck toy in the grassland"
image = pipeline(
    prompt,
    token_indices=[3, 8],  # token_indices,
    num_inference_steps=100,
    guidance_scale=6.0,
    generator=generator,
    eta=1.0,
).images[0]

Mar 12 '24 07:03 daeunni

This is a separate issue for which you should open a new thread.

Mar 12 '24 07:03 sayakpaul

Okay, I will. Thanks

Mar 12 '24 07:03 daeunni

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Apr 08 '24 15:04 github-actions[bot]

Closing because of inactivity.

Jun 30 '24 05:06 sayakpaul