
Images are the wrong size (256px for stable diffusion 1.5) after using ckpt to diffusers script

Open JohnnyRacer opened this issue 3 years ago • 3 comments

Describe the bug

Hello, I was testing the conversion script for ckpts to diffusers when I realized that all the images generated by the converted model are only 256x256 and of very low quality. I used the following command to convert the ckpt:

python convert_original_stable_diffusion_to_diffusers.py --checkpoint_path ./models/sd-v1-5_pruned_ema.ckpt  --original_config_file ./stable-diffusion/configs/stable-diffusion/v1-inference.yaml  --dump_path sd-v1-5-diffusers

And loaded the checkpoint using the folder's path with StableDiffusionPipeline like so:

import torch
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained("./sd-v1-5-diffusers", torch_dtype=torch.float16, safety_checker=None).to("cuda")

prompt = "a lion roaring"
torch.manual_seed(0)
image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]
image

The end result is not so nice: the resolution is capped at 256x256 and the content is not what was prompted. (image attached)

The ckpt was downloaded from HF's runwayml/stable-diffusion-v1-5 repo; its hash was verified, and it works well with Automatic1111's webui repo. Any help would be much appreciated.
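One guess (unverified, just reasoning from the symptom): the pipeline's default output resolution is derived from the converted UNet config as `sample_size * vae_scale_factor`, so a mis-converted `sample_size` of 32 instead of SD 1.5's expected 64 would produce exactly 256px images. A minimal sketch of that arithmetic:

```python
# Guess (unverified): diffusers derives the default output resolution from
# the UNet config as sample_size * vae_scale_factor. For SD 1.5 the UNet
# sample_size should be 64 and the VAE scale factor 8, i.e. 512px output;
# a mis-converted sample_size of 32 would explain the 256px images.
vae_scale_factor = 8  # 2 ** (len(vae.config.block_out_channels) - 1) for the SD VAE

expected_sample_size = 64  # what SD 1.5's UNet config should contain
buggy_sample_size = 32     # a hypothetical mis-converted value

print(expected_sample_size * vae_scale_factor)  # 512
print(buggy_sample_size * vae_scale_factor)     # 256
```

Passing `height=512, width=512` explicitly in the pipeline call overrides the config-derived default, which could serve as a quick check of this theory.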

Reproduction

wget https://raw.githubusercontent.com/huggingface/diffusers/039958eae55ff0700cfb42a7e72739575ab341f1/scripts/convert_original_stable_diffusion_to_diffusers.py
python convert_original_stable_diffusion_to_diffusers.py --checkpoint_path ./models/sd-v1-5_pruned_ema.ckpt  --original_config_file ./stable-diffusion/configs/stable-diffusion/v1-inference.yaml  --dump_path sd-v1-5-diffusers
import torch
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained("./sd-v1-5-diffusers", torch_dtype=torch.float16, safety_checker=None).to("cuda")

prompt = "a lion roaring"
torch.manual_seed(0)
image = pipe(prompt, num_inference_steps=50, guidance_scale=7).images[0]
image

Logs

No response

System Info

  • diffusers version: 0.10.2
  • Platform: Linux-5.4.0-135-generic-x86_64-with-glibc2.17
  • Python version: 3.8.15
  • PyTorch version (GPU?): 1.12.1+cu113 (True)
  • Huggingface_hub version: 0.11.1
  • Transformers version: 4.25.1
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

JohnnyRacer avatar Dec 12 '22 20:12 JohnnyRacer

Thanks for the repro @JohnnyRacer ! Checking now :-)

patrickvonplaten avatar Dec 16 '22 14:12 patrickvonplaten

@patrickvonplaten I tried to use the latest conversion script and that seemed to solve the problem. Do you want to still look into the problem or should I close the issue?
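For anyone hitting the same problem: what worked here was re-running the conversion with the script from the current main branch rather than the pinned commit in the Reproduction section. A sketch of the commands, assuming the same checkpoint and config paths as above:

```shell
# Fetch the conversion script from the main branch instead of the pinned commit
wget https://raw.githubusercontent.com/huggingface/diffusers/main/scripts/convert_original_stable_diffusion_to_diffusers.py

# Re-run the conversion with the same arguments as before
python convert_original_stable_diffusion_to_diffusers.py \
    --checkpoint_path ./models/sd-v1-5_pruned_ema.ckpt \
    --original_config_file ./stable-diffusion/configs/stable-diffusion/v1-inference.yaml \
    --dump_path sd-v1-5-diffusers
```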

JohnnyRacer avatar Dec 16 '22 21:12 JohnnyRacer

Ah I'm sorry that I didn't manage to follow up here. All good then, if it's solved for you :-)

patrickvonplaten avatar Dec 20 '22 00:12 patrickvonplaten

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jan 13 '23 15:01 github-actions[bot]