Worse results than A1111 SD webui
Describe the bug
Task: generate a new hairstyle for a photo with inpaint + ControlNet + mask.
Bug: the webui generates satisfactory images, while the diffusers pipeline does not.
webui result:
diffusers result:
Reproduction
Pipeline: StableDiffusionControlNetInpaintPipeline
Controlnet: open pose controlnet
Pipeline Input:
denoise_strength = 0.99
prompt: (black bob cut hair:1.2), (white backdrop:1.2), (extremely detailed and realistic hair:1.2), (high quality:1.2), (hires:1.2), (hyperrealistic:1.2), RAW photo
negative prompt: (mutated hands:1.5), (poorly drawn hands:1.5), (hands:1.3), (nsfw:1.3), (nudity:1.3), cartoon, painting, illustration, (worst quality, low quality, normal quality:2), (deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime), text, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
Input image:
Input mask image:
Input open pose is generated from:
controlnet_conditioning_scale=0.5
ControlnetWeight in webui=0.5
steps=30
scheduler: DPM++ 2M SDE Karras
guidance_scale=7.0
model: majicmixRealistic_v7.safetensors, https://civitai.com/models/43331/majicmix-realistic
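Note that the snippet in this report never sets a scheduler, so the pipeline presumably runs with whatever default the checkpoint config yields rather than DPM++ 2M SDE Karras. A minimal sketch of switching to the diffusers counterpart of that webui sampler follows; the helper name `apply_a1111_sampler` and the lookup table are hypothetical, and the mapping is the one I understand the diffusers scheduler docs to give for this sampler.

```python
# Webui sampler name -> (diffusers scheduler class name, extra config overrides).
# Only the sampler used in this report is listed; the table itself is illustrative.
A1111_SAMPLERS = {
    "DPM++ 2M SDE Karras": (
        "DPMSolverMultistepScheduler",
        {"algorithm_type": "sde-dpmsolver++", "use_karras_sigmas": True},
    ),
}


def apply_a1111_sampler(pipe, sampler_name):
    """Replace pipe.scheduler with the diffusers equivalent of a webui sampler."""
    import diffusers  # imported lazily so the table can be inspected without it

    class_name, extra = A1111_SAMPLERS[sampler_name]
    scheduler_cls = getattr(diffusers, class_name)
    # Reuse the existing scheduler config and override only the listed fields.
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config, **extra)
    return pipe
```

Calling `apply_a1111_sampler(pipe, "DPM++ 2M SDE Karras")` before running the pipeline should bring the sampler closer to the webui setting, though the two implementations are not guaranteed to produce identical images for the same seed.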
diffusers code snippet:
import os

import torch
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline

# Excerpt from a class: self.pretrained_model_dir holds the local model files.
self.open_pose_processor = OpenposeDetector.from_pretrained(self.pretrained_model_dir)
controlnet_openpose = ControlNetModel.from_single_file(
    os.path.join(self.pretrained_model_dir, "control_v11p_sd15_openpose.pth"),
    config_file=os.path.join(self.pretrained_model_dir, "control_v11p_sd15_openpose.yaml"),
    local_files_only=True,
    torch_dtype=torch.float16,
)

# Load the base checkpoint (majicmixRealistic_v7.safetensors) with the ControlNet attached.
config_file = os.path.join(self.pretrained_model_dir, "v1-inference.yaml")
pipe = StableDiffusionControlNetInpaintPipeline.from_single_file(
    os.path.join(self.pretrained_model_dir, model),
    original_config_file=config_file,
    controlnet=controlnet_openpose,
    num_in_channels=4,
    safety_checker=None,
    use_safetensors=True,
    local_files_only=True,
    torch_dtype=torch.float16,
)

# Build the OpenPose control image at the resolution of the shorter image side.
control_image_resolution = min(input_image.size)
open_pose_image = self.open_pose_processor(
    input_image,
    include_body=True,
    include_hand=True,
    include_face=True,
    detect_resolution=control_image_resolution,
    image_resolution=control_image_resolution,
)

# Prompts are passed as plain strings, so the "(text:1.2)" weights are not parsed.
result = pipe(
    prompt=text_prompt,
    negative_prompt=negative_prompt,
    num_images_per_prompt=1,
    num_inference_steps=30,
    generator=torch.manual_seed(0),
    image=input_image,
    control_image=open_pose_image,
    guidance_scale=7.0,
    controlnet_conditioning_scale=0.5,
    mask_image=mask_image,
    guess_mode=guess_mode,
    strength=denoise_strength,
    width=input_image.width,
    height=input_image.height,
    callback_on_step_end=callback.callback_on_step_end if callback else None,
).images[0]
Logs
No response
System Info
- diffusers version: 0.27.2
- Platform: Linux-5.4.0-152-generic-x86_64-with-glibc2.35
- Python version: 3.11.8
- PyTorch version (GPU?): 2.2.1+cu121 (True)
- Huggingface_hub version: 0.22.1
- Transformers version: 4.39.1
- Accelerate version: 0.28.0
- xFormers version: 0.0.25
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
I would really appreciate any help with this. @stevhliu @lllyasviel
I'll give it a test and try to understand what's making them different and get back to you.
At first glance, it looks as though prompt and negative_prompt are not being weighted correctly. diffusers does not have official built-in prompt weighting, so the `(text:1.2)` syntax is passed to the text encoder as literal text. I would recommend taking a look at the Compel project.
Auto1111 has this weighting built-in directly, and that is likely a big contributing factor.
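For anyone landing here later, a rough sketch of what that could look like. Compel writes weights as `(text)1.2` rather than A1111's `(text:1.2)`, so the prompts need converting first. The helper names `a1111_to_compel` and `build_prompt_embeds` are hypothetical, and the converter only handles the explicit `(text:weight)` form; unweighted or nested parentheses are left untouched, so treat this as a starting point rather than a full A1111 parser.

```python
import re

# Matches A1111-style "(text:1.2)". Parentheses without an explicit weight are left as-is.
_A1111_WEIGHT = re.compile(r"\(([^():]+):\s*([0-9]*\.?[0-9]+)\)")


def a1111_to_compel(prompt: str) -> str:
    """Rewrite "(text:1.2)" as Compel's "(text)1.2"."""
    return _A1111_WEIGHT.sub(lambda m: f"({m.group(1).strip()}){m.group(2)}", prompt)


def build_prompt_embeds(pipe, prompt: str, negative_prompt: str):
    """Return (prompt_embeds, negative_prompt_embeds) with the weights applied."""
    from compel import Compel  # imported lazily; requires `pip install compel`

    # truncate_long_prompts=False keeps negative prompts longer than 77 tokens intact.
    compel = Compel(
        tokenizer=pipe.tokenizer,
        text_encoder=pipe.text_encoder,
        truncate_long_prompts=False,
    )
    positive = compel(a1111_to_compel(prompt))
    negative = compel(a1111_to_compel(negative_prompt))
    # Both tensors must have the same sequence length before being passed to the pipeline.
    positive, negative = compel.pad_conditioning_tensors_to_same_length([positive, negative])
    return positive, negative
```

The two tensors would then be passed to the pipeline as `prompt_embeds=` and `negative_prompt_embeds=` in place of `prompt=` and `negative_prompt=`. Compel's weighting math may not be numerically identical to the webui's, so the outputs should get closer but will not necessarily match.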
Nice catch, that saved me from having to install auto1111. That's probably the difference, and even if it isn't, I'll wait to see if @weiweiwang updates the issue for now.
We have a section on using compel with Diffusers here if it helps :)
@stevhliu @asomoza @ghunkins
Thanks, my friends, the output became better with Compel✌🏻