diffusers Using StableDiffusionImg2ImgPipeline with prompt_embeds returns ValueError: `prompt` has to be of type `str` or `list` but is <class 'NoneType'>

Describe the bug

I'm trying to run inference with StableDiffusionImg2ImgPipeline using prompt embeddings rather than a string prompt. I am using the Compel library to create the embeddings. This works well with StableDiffusionPipeline, but StableDiffusionImg2ImgPipeline raises the following error:

ValueError: `prompt` has to be of type `str` or `list` but is <class 'NoneType'>

Reproduction

import torch
from diffusers import StableDiffusionImg2ImgPipeline
from compel import Compel

pipeline = StableDiffusionImg2ImgPipeline.from_pretrained('stabilityai/stable-diffusion-2-1-base', torch_dtype=torch.float16, revision='fp16')
pipeline = pipeline.to('cuda')

prompt = 'a fantasy landscape with a mountain in background'
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)
prompt_embeds = compel.build_conditioning_tensor(prompt)

result = pipeline(
    prompt_embeds=prompt_embeds,
    image='https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg',
    strength=0.6,
    num_inference_steps=20,
    guidance_scale=7,
    num_images_per_prompt=1
)

Logs

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py", line 623, in __call__
    self.check_inputs(prompt, strength, callback_steps, negative_prompt, prompt_embeds, negative_prompt_embeds)
  File "/usr/local/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py", line 432, in check_inputs
    raise ValueError(f"`prompt` has to be of type `str` or `list` but is {type(prompt)}")
ValueError: `prompt` has to be of type `str` or `list` but is <class 'NoneType'>

System Info

diffusers version: 0.12.1
Platform: Linux-5.4.72-microsoft-standard-WSL2-x86_64-with-glibc2.31
Python version: 3.10.10
PyTorch version (GPU?): 1.13.1+cu116 (True)
Huggingface_hub version: 0.12.0
Transformers version: 0.15.0
Accelerate version: not installed
xFormers version: 0.0.16
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: No

Feb 17 '23 07:02 alexisrolland

Could you first load the input image as a PIL Image and provide that while calling the pipeline?
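A minimal sketch of what that could look like, assuming Pillow is installed. The helper names `fetch_bytes` and `to_init_image` and the 768x512 target size are illustrative choices, not part of diffusers. This only addresses the image input and is separate from the `prompt` validation error above.

```python
from io import BytesIO
from typing import Tuple
from urllib.request import urlopen

from PIL import Image


def fetch_bytes(url: str) -> bytes:
    """Download the raw bytes behind a URL (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def to_init_image(raw_bytes: bytes, size: Tuple[int, int] = (768, 512)) -> Image.Image:
    """Decode raw image bytes into an RGB PIL image resized for img2img.

    The default size is an assumption made for this example.
    """
    image = Image.open(BytesIO(raw_bytes)).convert("RGB")
    return image.resize(size)


# Usage (requires network access, so it is left commented out):
# url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
# init_image = to_init_image(fetch_bytes(url))
# result = pipeline(prompt_embeds=prompt_embeds, image=init_image, strength=0.6)
```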

Feb 20 '23 08:02 sayakpaul

Hi.

With https://github.com/huggingface/diffusers/pull/2423 merged, this issue should go away. I tested it with the Depth2Image pipeline and it seems to work: https://colab.research.google.com/gist/sayakpaul/0ad064c19ff7cce134028c38f55af81e/scratchpad.ipynb

Feb 20 '23 09:02 sayakpaul

Thanks @sayakpaul

> Could you first load the input image as a PIL Image and provide that while calling the pipeline?

I have tried both a PIL image and a tensor, but I still get the same error.

The PR https://github.com/huggingface/diffusers/pull/2423 seems to fix StableDiffusionDepth2ImgPipeline, but will it also fix StableDiffusionImg2ImgPipeline?

Feb 20 '23 10:02 alexisrolland

Ah! It might not. Would you mind opening a PR that follows the same approach?

Feb 20 '23 10:02 sayakpaul

It seems this has been fixed in the latest release, 0.13.0.
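For anyone reading this later, here is a simplified sketch of the kind of input check that accepts either `prompt` or `prompt_embeds`. It is not the library's actual code: the function name `check_prompt_inputs` and the wording of the first two messages are my own, and only the last message is copied from the traceback above.

```python
from typing import List, Optional, Union


def check_prompt_inputs(
    prompt: Optional[Union[str, List[str]]] = None,
    prompt_embeds: Optional[object] = None,
) -> None:
    """Validate that exactly one of `prompt` / `prompt_embeds` is given.

    Hypothetical stand-in for a pipeline's input check; a real pipeline
    validates more arguments (strength, callback steps, negative prompts).
    """
    if prompt is not None and prompt_embeds is not None:
        raise ValueError("Provide only one of `prompt` or `prompt_embeds`.")
    if prompt is None and prompt_embeds is None:
        raise ValueError("Provide either `prompt` or `prompt_embeds`.")
    # The type check only runs when a prompt was actually passed, so
    # prompt=None together with prompt_embeds no longer triggers it.
    if prompt is not None and not isinstance(prompt, (str, list)):
        raise ValueError(
            f"`prompt` has to be of type `str` or `list` but is {type(prompt)}"
        )


# Embeddings only: passes validation.
check_prompt_inputs(prompt_embeds=[[0.0, 0.1]])
# String prompt only: passes validation.
check_prompt_inputs(prompt="a fantasy landscape")
```

The error in this issue corresponds to running the type check unconditionally, which rejects `prompt=None` even when embeddings are supplied.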

Feb 21 '23 06:02 alexisrolland

Thanks!

Mar 06 '23 11:03 patrickvonplaten