diffusers Xlabs controlnets do not work with FluxControlNetInpaintPipeline

Describe the bug

An error is raised when an Xlabs controlnet is used with FluxControlNetInpaintPipeline.

Reproduction

controlnet = FluxControlNetModel.from_pretrained("xlabs…")
pipe = FluxControlNetInpaintPipeline.from_pretrained(…, controlnet=controlnet)

image = pipe(…)

Logs

Given groups=1, weight size of [16,3,3,3], expected input [1,1,4096,64] to have 3 channels, but got 1 channels instead.


Switching to a previously supported controlnet such as InstantX does not trigger this error. All input images were confirmed to have shape (h, w, 3) before the forward pass.
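One possible reading of the shape in the log, offered as a sketch rather than a confirmed diagnosis: the control image may reach the controlnet's first convolution as packed latents instead of as an RGB image. The snippet below only does the shape arithmetic. It assumes a VAE that downsamples by 8 with 16 latent channels and 2x2 patch packing, and a 1024x1024 input (the thread does not state the resolution). The helper names `packed_latent_shape` and `as_conv2d_input` are hypothetical and are not part of diffusers.

```python
def packed_latent_shape(height, width, batch=1, vae_scale=8, latent_channels=16, patch=2):
    """Shape of an image after VAE encoding and patch packing (assumed layout)."""
    lat_h, lat_w = height // vae_scale, width // vae_scale
    tokens = (lat_h // patch) * (lat_w // patch)   # one token per 2x2 latent patch
    channels = latent_channels * patch * patch     # each patch is flattened into the channel dim
    return (batch, tokens, channels)


def as_conv2d_input(shape):
    """Mimic how a 2D convolution reads a 3D tensor: as unbatched (C, H, W) plus a batch dim."""
    if len(shape) == 3:
        return (1, *shape)
    return tuple(shape)


packed = packed_latent_shape(1024, 1024)
print(packed)                   # (1, 4096, 64)
print(as_conv2d_input(packed))  # (1, 1, 4096, 64): 1 channel where the conv weight expects 3
```

Under these assumptions the result matches the `[1,1,4096,64]` input reported in the log, which would be consistent with the pipeline encoding and packing the control image before handing it to a controlnet that expects raw 3-channel pixels.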

System Info

diffusers from source

Who can help?

@sayakpaul @ang

Oct 21 '24 17:10 neuron-party

Thanks for the issue! Are you on main with the most recent commit? With this PR it should work now: https://github.com/huggingface/diffusers/pull/9687

Oct 21 '24 23:10 yiyixuxu

if not, please share your checkpoint and a complete reproducible script

Oct 21 '24 23:10 yiyixuxu

@yiyixuxu yes i am on the latest commit of diffusers. reproducing the issue is simple, just run a pipeline call with the appropriate parameters and images.

Oct 22 '24 00:10 neuron-party

I'm able to run this

import torch
from diffusers import FluxControlNetModel
from diffusers.pipelines import FluxControlNetPipeline
from diffusers.utils import load_image

# Fixed seed so the run is reproducible.
generator = torch.Generator(device="cuda").manual_seed(87544357)

# Load the Xlabs canny controlnet and attach it to the (non-inpaint) Flux controlnet pipeline.
controlnet = FluxControlNetModel.from_pretrained(
    "Xlabs-AI/flux-controlnet-canny-diffusers",
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
)
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    controlnet=controlnet,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

control_image = load_image("https://huggingface.co/Xlabs-AI/flux-controlnet-canny-diffusers/resolve/main/canny_example.png")
prompt = "handsome girl with rainbow hair, anime"

image = pipe(
    prompt,
    control_image=control_image,
    controlnet_conditioning_scale=0.7,
    num_inference_steps=25,
    guidance_scale=3.5,
    height=1024,
    width=768,
    generator=generator,
    num_images_per_prompt=1,
).images[0]

image.save("yiyi_test_3_out.png")

Oct 22 '24 00:10 yiyixuxu

ohh inpaint pipeline

Oct 22 '24 00:10 yiyixuxu

Yes, inpaint support has not been added yet for Xlabs controlnets. Would you be interested in helping us? We just need to apply the same change we applied in this PR: https://github.com/huggingface/diffusers/pull/9687/files#diff-6ce8c2692e053a21fac56265bd1b4c911d4b2df7f432cb9e9b9fca015bec101b

Oct 22 '24 00:10 yiyixuxu

sure thing

Oct 22 '24 00:10 neuron-party

Hey, @neuron-party are you submitting the PR? I can take up the PR if you are not.

Oct 22 '24 05:10 charchit7

Hi guys, can this be assigned to me? I'm new to contributing and would like to do so during Hacktoberfest.

Oct 22 '24 15:10 SakshamDhawan

I think @neuron-party is already on this! do you want to work on img2img?

Oct 22 '24 20:10 yiyixuxu

Yup can do 👍🏽 will post there now - thanks!

Oct 22 '24 22:10 SakshamDhawan