diffusers SD3 - image_processor.py:111: RuntimeWarning: invalid value encountered in cast images = (images * 255).round().astype("uint8")

Describe the bug

Generation fails at a late stage, after the SD3 sampling steps finish, possibly during the VAE operation. The warning message is:

/usr/local/lib/python3.10/dist-packages/diffusers/image_processor.py:111: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")

A black image is created.

Reproduction

import torch
from diffusers import StableDiffusion3Pipeline, AutoencoderTiny

cat_prompt = "A cat holding a sign that says hello world"
cat_file = "sd3_hello_world-quantized-T5.png"

pipe = StableDiffusion3Pipeline.from_single_file(
    "https://huggingface.co/stabilityai/stable-diffusion-3-medium/blob/main/sd3_medium_incl_clips_t5xxlfp8.safetensors",
    torch_dtype=torch.float16,
)
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd3", torch_dtype=torch.float16)
pipe.vae.config.shift_factor = 0.0

pipe.enable_model_cpu_offload()

image = pipe(
    prompt=cat_prompt,
    negative_prompt="",
    width=768,
    height=512,
    num_inference_steps=28,
    num_images_per_prompt=2,
    guidance_scale=7.0,
).images[0]
image.save(cat_file)

Logs

python3  hug_test_txt2img_sd3_single_file.py 
Fetching 21 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 45053.90it/s]
Loading pipeline components...:  67%|████████████████████████████████████████████████████████████████▋                                | 6/9 [00:07<00:04,  1.39s/it]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:09<00:00,  1.08s/it]
The config attributes {'block_out_channels': [64, 64, 64, 64]} were passed to AutoencoderTiny, but are not expected and will be ignored. Please verify your config.json configuration file.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 28/28 [00:27<00:00,  1.01it/s]
/usr/local/lib/python3.10/dist-packages/diffusers/image_processor.py:111: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")

System Info

🤗 Diffusers version: 0.29.2
Platform: Linux-6.8.0-36-generic-x86_64-with-glibc2.35
Running on a notebook?: No
Running on Google Colab?: No
Python version: 3.10.12
PyTorch version (GPU?): 2.3.1+cu121 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Huggingface_hub version: 0.23.4
Transformers version: 4.41.2
Accelerate version: 0.31.0
PEFT version: 0.11.1
Bitsandbytes version: 0.43.1
Safetensors version: 0.4.3
xFormers version: 0.0.27+c52921e.d20240627
Accelerator: NVIDIA GeForce RTX 3060, 12288 MiB VRAM
Using GPU in script?: yes
Using distributed or parallel set-up in script?: no

Who can help?

No response

Jul 01 '24 19:07 zagglez

Thanks for reporting. Certain layers of the T5 encoder need to be kept in FP32, which doesn't seem to be happening with single-file loading. PR #8778 should fix the issue.
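To illustrate why precision matters here, below is a minimal NumPy sketch. It is an illustration only, not the T5 or diffusers code. float16 cannot represent magnitudes above 65504, so a large intermediate value overflows to infinity, and later arithmetic on infinities can produce NaN. The sample values and the helper name cast_to_half are made up for the example.

```python
import numpy as np

# Largest finite value representable in float16.
FP16_MAX = float(np.finfo(np.float16).max)


def cast_to_half(values):
    """Cast float32 values to float16, silencing the overflow warning."""
    with np.errstate(over="ignore", invalid="ignore"):
        return np.asarray(values, dtype=np.float32).astype(np.float16)


# Hypothetical activations: two of them exceed the float16 range.
activations = cast_to_half([1.0, 70000.0, -70000.0])

# Summing +inf and -inf yields NaN, which then propagates downstream.
with np.errstate(invalid="ignore"):
    total = activations.sum()

print("float16 max:", FP16_MAX)
print("activations:", activations)
print("sum is NaN:", bool(np.isnan(total)))
```

Once a NaN appears in an intermediate tensor it spreads through every later operation, which would be consistent with the all-black output reported above.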

You won't face this issue if you load the SD3 model using from_pretrained. An example snippet is in the model card here: https://huggingface.co/madebyollin/taesd3

Jul 03 '24 05:07 DN6

@zagglez This issue should be resolved now that #8778 has been merged. Can you try installing from main and running?

Jul 08 '24 05:07 DN6

I am running a Stable Diffusion model on Google Colab and am facing the same issue: a black image is created along with this warning.

Jul 12 '24 08:07 kinjal-1007

@kinjal-1007 You will have to install diffusers from main before running your code.

pip install git+https://github.com/huggingface/diffusers.git
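After reinstalling, it can help to confirm which build the running interpreter actually sees. On Colab, a runtime restart is usually needed if the package was already imported. The sketch below uses only the standard library. The helper names are hypothetical, and the assumption that a source install reports a version containing ".dev" should be checked against your own output.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def looks_like_dev_build(version: Optional[str]) -> bool:
    """Heuristic (assumption): source installs carry a '.dev' suffix."""
    return version is not None and ".dev" in version


version = installed_version("diffusers")
print("diffusers:", version, "| dev build:", looks_like_dev_build(version))
```

If the printed version is still the old release, the source install did not take effect in the current session.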

Jul 17 '24 03:07 DN6

@DN6 Hello sir, I am using Google Colab because my PC doesn't have the GPU memory these models require. Is there a way to resolve this issue on Google Colab?

Jul 18 '24 05:07 kinjal-1007

I also get that error when trying to generate images. I am also using Colab as a testing environment.

Reviewing the code, I see that the warning comes from VaeImageProcessor:

/usr/local/lib/python3.10/dist-packages/torchsde/_brownian/brownian_interval.py:599: UserWarning: Should have ta>=t0 but got ta=0.0291748046875 and t0=0.029175.
  warnings.warn(f"Should have ta>=t0 but got ta={ta} and t0={self._start}.")
/usr/local/lib/python3.10/dist-packages/diffusers/image_processor.py:111: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")

After searching a little, I found that this happens when the images array contains non-finite values (NaN or infinity).
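A small sketch of how one might check for this before the cast. It reuses the conversion expression quoted in the warning, but the function names and the sample arrays are hypothetical, and this is not the diffusers implementation.

```python
import numpy as np


def has_non_finite(images: np.ndarray) -> bool:
    """True if the array contains any NaN or +/-infinity."""
    return not bool(np.isfinite(images).all())


def to_uint8(images: np.ndarray) -> np.ndarray:
    """Same conversion as the line in the warning, but fail loudly on bad input."""
    if has_non_finite(images):
        raise ValueError("images contain NaN or infinity; cast to uint8 is undefined")
    return (images * 255).round().astype("uint8")


# Hypothetical decoded batch in [0, 1]: shape (batch, height, width, channels).
good = np.ones((1, 2, 2, 3), dtype=np.float32)
bad = good.copy()
bad[0, 0, 0, 0] = np.nan

print("good has non-finite values:", has_non_finite(good))
print("bad has non-finite values:", has_non_finite(bad))
```

Casting NaN to an unsigned integer has no defined result, and in practice it often comes out as 0, so an array that is entirely NaN can turn into a black image.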

Any solution you recommend to solve it?

Jul 30 '24 04:07 Eduardishion

@Eduardishion @kinjal-1007 This issue only affects single file model loading.

You can either load the model using from_pretrained:

import torch
from diffusers import StableDiffusion3Pipeline
pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16)

Or, if you need single-file loading, install diffusers from source using the following command:

pip install git+https://github.com/huggingface/diffusers.git

Jul 30 '24 05:07 DN6

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

Sep 14 '24 15:09 github-actions[bot]

Marking as resolved due to inactivity, and because I think it has been resolved based on the discussion above. If I'm mistaken, please feel free to re-open the issue and apologies for the inconvenience!

Nov 20 '24 02:11 a-r-r-o-w