
RuntimeError: expected scalar type BFloat16 but found Float

Open picard314 opened this issue 2 years ago • 15 comments

Below is the log I encountered when running the following command:

python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt <path/to/768model.ckpt/> --config configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768

Running DDIM Sampling with 50 timesteps
DDIM Sampler: 0%| | 0/50 [00:00<?, ?it/s]
data: 0%| | 0/1 [00:00<?, ?it/s]
Sampling: 0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "scripts/txt2img.py", line 388, in <module>
    main(opt)
  File "scripts/txt2img.py", line 347, in main
    samples, _ = sampler.sample(S=opt.steps,
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddim.py", line 104, in sample
    samples, intermediates = self.ddim_sampling(conditioning, size,
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddim.py", line 164, in ddim_sampling
    outs = self.p_sample_ddim(img, cond, ts, index=index, use_original_steps=ddim_use_original_steps,
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddim.py", line 212, in p_sample_ddim
    model_uncond, model_t = self.model.apply_model(x_in, t_in, c_in).chunk(2)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/models/diffusion/ddpm.py", line 1335, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
    h = module(h, emb, context)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/disk1/swh/git_sd/stablediffusion/ldm/modules/attention.py", line 327, in forward
    x = self.norm(x)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 272, in forward
    return F.group_norm(
  File "/root/miniconda3/envs/ldmsd/lib/python3.8/site-packages/torch/nn/functional.py", line 2516, in group_norm
    return torch.group_norm(input, num_groups, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type BFloat16 but found Float
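The last frames show `torch.group_norm` rejecting its arguments. One reading of this (an interpretation, not something the log states) is that the input reached the norm layer as BFloat16 while the layer's weights were still Float. A minimal sketch for checking which dtypes a module's parameters use follows; the helper names are hypothetical, and the code only assumes the object exposes `named_parameters()`, as `torch.nn.Module` does.

```python
from collections import Counter


def summarize_param_dtypes(module):
    """Count parameters per dtype for any object exposing named_parameters()."""
    return Counter(str(param.dtype) for _, param in module.named_parameters())


def mismatched_params(module, input_dtype):
    """Return the names of parameters whose dtype differs from the input dtype."""
    return [
        name
        for name, param in module.named_parameters()
        if param.dtype != input_dtype
    ]


# Example usage (assumes PyTorch is installed):
#   import torch
#   norm = torch.nn.GroupNorm(num_groups=32, num_channels=320)
#   print(summarize_param_dtypes(norm))
#   print(mismatched_params(norm, torch.bfloat16))
```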

Has anyone run into the same error and found a solution?

picard314 avatar Apr 10 '23 09:04 picard314

Have you solved the issue?

simonnxren avatar Apr 11 '23 03:04 simonnxren

> Have you solved the issue?

Yes, I have. It was due to an incompatibility between PyTorch and CUDA.

picard314 avatar Apr 11 '23 07:04 picard314

> > Have you solved the issue?

> Yes, I have. It was due to an incompatibility between PyTorch and CUDA.

I am facing the same issue myself. Is it incompatible with CUDA in general, or with a specific version of it? I have a hard time imagining running it without the GPU. How did you fix it?

adirz avatar Apr 11 '23 20:04 adirz

Embarrassingly, I switched to running on the GPU to circumvent the issue. The incompatibility was in fact a problem I hit when I used the GPU. I have not looked into running on the CPU any further, but I guess changing the torch version may help you, @adirz.

@simonnxren Sorry for giving you a vague answer.

picard314 avatar Apr 12 '23 09:04 picard314

@picard314 I have run into this issue too. I was able to make adjustments so that the code runs, but it is using my CPU and not my NVIDIA GPU. I'm running CUDA 11.7, as that seemed to be the correct version. Which CUDA version are you using, and what did you do to resolve the issue?

wobblytables avatar Apr 17 '23 11:04 wobblytables

@wobblytables Mine is CUDA 11.4.

For CUDA 11.7, I think the installation needs to be:

conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
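After installing, it may help to confirm which CUDA version the PyTorch build targets and whether a GPU is actually visible to it. The following is a minimal sketch; the function name `describe_torch_build` is hypothetical, and it returns early if PyTorch is not installed.

```python
import importlib.util


def describe_torch_build():
    """Report the installed PyTorch version and the CUDA version it was built for."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # None for CPU-only builds
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If `cuda_available` comes back `False` on a machine with an NVIDIA GPU, a mismatch between the driver and the PyTorch build is one possible cause.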

picard314 avatar Apr 18 '23 04:04 picard314

> Yes, I have. It was due to an incompatibility between PyTorch and CUDA.

I had the same problem and solved it by setting it up to run on the GPU.

lijain avatar Apr 20 '23 13:04 lijain

> > Yes, I have. It was due to an incompatibility between PyTorch and CUDA.

> I had the same problem and solved it by setting it up to run on the GPU.

I ran into the same problem. Do you mean using something like setting CUDA_VISIBLE_DEVICES to set up the GPU? Thank you very much.
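For reference, `CUDA_VISIBLE_DEVICES` only controls which GPUs a process can see; on its own it does not make a script choose the GPU over the CPU. Below is a minimal sketch of setting it from Python; the helper name `restrict_visible_gpus` is hypothetical, and the variable has to be set before CUDA is initialized.

```python
import os


def restrict_visible_gpus(indices):
    """Expose only the given GPU indices to CUDA libraries loaded afterwards."""
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this before importing torch (or anything else that initializes CUDA).
restrict_visible_gpus([0])
```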

yu-liu24 avatar Apr 23 '23 16:04 yu-liu24

> @wobblytables Mine is CUDA 11.4.

> For CUDA 11.7, I think the installation needs to be:

> conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia

If you don't mind, may I ask which GPU you have and which version of PyTorch you used? I have a GeForce 3060 and used CUDA 11.4 with PyTorch 1.12.1, but I hit that error. I then changed the CUDA version to 11.6 and still have the same problem...

hotelbread avatar May 17 '23 01:05 hotelbread

Adding --device cuda worked for me.

It looks like a change made the CPU the default device (https://github.com/Stability-AI/stablediffusion/pull/147/files#diff-048b7bba4049f97b2038502af5686b6c5f53a882ff02771fcb0d733d22a0ab6cR180-R186); I think that was messing up the data types.
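To illustrate why the flag matters, here is a simplified sketch of a parser whose `--device` option defaults to the CPU. It is modeled on the description above rather than copied from `scripts/txt2img.py`; the function name `build_parser` and the help text are illustrative only.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the script's argument parser."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--device",
        choices=["cpu", "cuda"],
        default="cpu",  # assumption: CPU is the default, as described above
        help="device to run sampling on",
    )
    return parser


parser = build_parser()
print(parser.parse_args([]).device)                    # no flag: falls back to the default
print(parser.parse_args(["--device", "cuda"]).device)  # explicit GPU selection
```

For the command in the original report, this amounts to appending `--device cuda` to the invocation.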

Mateusmsouza avatar May 22 '23 23:05 Mateusmsouza

> Adding --device cuda worked for me.

> It looks like a change made the CPU the default device (https://github.com/Stability-AI/stablediffusion/pull/147/files#diff-048b7bba4049f97b2038502af5686b6c5f53a882ff02771fcb0d733d22a0ab6cR180-R186); I think that was messing up the data types.

Nice solution, it worked for me too.

order-a-lemonade avatar May 29 '23 04:05 order-a-lemonade

How do you fix this error when you actually want to run it on the CPU? I can't find a way to.

asdfjkluiop avatar Jan 11 '24 03:01 asdfjkluiop

Is this going to get fixed?

I read the documentation, installed the requirements, and ran the example. It crashed with this error message.

That seems like a pretty critical bug, but it hasn't even been assigned to anyone yet after 9 months.

esiefker avatar Jan 26 '24 01:01 esiefker

As a hint, here is something that might help: use "--precision full" (taken from here: https://huggingface.co/CompVis/stable-diffusion-v1-4/discussions/42). In addition, there are special configs for CPU processing in the "intel" folder of this repo. Currently I'm using the "-fp32" config in combination with the precision flag, and it at least generates some images. I'm not sure what the root cause really is, as I'm no expert in this field, but this https://github.com/Stability-AI/stablediffusion/blob/main/ldm/modules/attention.py#L175 looks suspicious...
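One possible reading of why "--precision full" helps, offered as an assumption rather than a confirmed diagnosis: if sampling is wrapped in `torch.autocast` while the device is the CPU, autocast uses BFloat16 by default there, so activations can arrive at a layer as BFloat16 while its weights are still Float. The sketch below shows the general pattern; `precision_scope` is a hypothetical helper, and the `"autocast"` value is assumed to be the alternative to `"full"`.

```python
from contextlib import nullcontext


def precision_scope(precision, device_type):
    """Return a context manager for the requested precision mode."""
    if precision == "autocast":
        import torch  # imported lazily so "full" does not need autocast at all

        # On CPU, torch.autocast defaults to bfloat16 when no dtype is given.
        return torch.autocast(device_type)
    # "full": leave every tensor in the dtype it already has
    return nullcontext()


with precision_scope("full", "cpu"):
    pass  # model sampling would go here
```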

questor avatar Jan 26 '24 15:01 questor

I am having the same issue with

!pip install torch==2.0.1 transformers datasets peft accelerate trl bitsandbytes optimum

when I try to load the X_IA3 adapters.

SofiaBianchi123 avatar Apr 16 '24 23:04 SofiaBianchi123