stable-diffusion.cpp

Lora broken output

Open gartia opened this issue 1 year ago • 19 comments

On this build, most of the LoRAs I have produce a black image as output. I'm not entirely sure why this would be the case, but the next build after that one spams the console with this message for a large number of tensors:

[ERROR] model.cpp:893 - skip tensor 'lora_unet_output_blocks_5_1_transformer_blocks_1_ff_net_2.alpha' with n_dims 0

gartia avatar Apr 30 '24 10:04 gartia

Same here. I just commented out the lines added in this commit and it works fine again.

SenninOne avatar Apr 30 '24 18:04 SenninOne

Can you provide lora examples?

ring-c avatar May 02 '24 12:05 ring-c

Any LoRA I've tried, at least with SDXL (I don't have any base models downloaded). But you can try the SDXL Lightning 8-step LoRA. I used a simple prompt of just "mountains".

This commit and those before it result in: 6d16f68
This commit and the newest result in: 760cfaa

gartia avatar May 02 '24 13:05 gartia

> Any LORA i've tried with sdxl atleast i don't have any base models downloaded. But you can try the sdxl lightning 8 step lora

Without fix:

lora.hpp:31 - loading LoRA from '[...]/sdxl_lightning_8step_lora.safetensors'
model.cpp:1379 - loading tensors from [...]/sdxl_lightning_8step_lora.safetensors
gen: [...]/ggml/src/ggml.c:2745: ggml_new_tensor_impl: Assertion `n_dims >= 1 && n_dims <= GGML_MAX_DIMS' failed.
SIGABRT: abort

With the fix I do get an image with broken colors (mostly black), but without the fix there is no image at all.

ring-c avatar May 02 '24 13:05 ring-c

Wouldn't skipping tensors with n_dims of 0 skip scalars? I'm not well versed in machine learning, so I could be wrong.

gartia avatar May 02 '24 13:05 gartia

> Wouldn't skipping n_dims of 0 skip scalars. I'm not extremely versed with machine learning, so i could be wrong

That is not really important here: GGML will raise an error, so we can't use n_dims = 0 anyway.
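For context on why a scalar trips the check: a scalar stored in a safetensors file has an empty shape, while the assertion quoted earlier in the thread requires at least one dimension. The following is a minimal sketch of one possible workaround, assuming the loader is free to store a scalar as a one-element 1-D tensor. `normalize_shape` and `kMaxDims` are hypothetical names, not part of sd.cpp or GGML, and this is not necessarily how the project handles the case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for GGML_MAX_DIMS (assumed to be 4 in this sketch).
constexpr std::size_t kMaxDims = 4;

// Hypothetical helper: adjust a shape read from a file so that it satisfies
// the `n_dims >= 1 && n_dims <= GGML_MAX_DIMS` check.
// A scalar (empty shape) becomes a 1-D tensor holding a single element.
// Returns false when the shape has more dimensions than can be represented.
bool normalize_shape(std::vector<int64_t>& shape) {
    if (shape.empty()) {
        shape.push_back(1);  // scalar -> {1}: same single value, n_dims == 1
    }
    assert(!shape.empty());  // invariant: the result is always at least 1-D
    return shape.size() <= kMaxDims;
}
```

With this approach an `.alpha` scalar would be kept as a one-element tensor instead of being skipped, and ordinary 1-D to 4-D shapes pass through unchanged.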

It's strange that you were getting results "before". Are you sure you are applying the LoRA?

ring-c avatar May 02 '24 13:05 ring-c

Yeah, they are definitely being applied correctly. I tested with the Python version and got the same result as with the cpp version. After that check was added, it's only a black image with artifacts now.

gartia avatar May 02 '24 13:05 gartia

If you want me to test this, I need a minimal example to reproduce your error:

  1. The model you are using (Civitai links are OK)
  2. Your parameters and the command you are using to generate test images
  3. The versions of sd.cpp you are using. Don't assume; double-check what you are running.

ring-c avatar May 02 '24 13:05 ring-c

For testing purposes I'm using this setup: Model, VAE, LoRA

./sd.exe -v -m ./models/sd_xl_base_1.0.safetensors --vae './vae/sdxl.vae.safetensors' -s 1 -b 1 -W 1024 -H 1024 --lora-model-dir ./loras --cfg-scale 7 --steps 20 --clip-skip 2 -p "<lora:sdxl_lightning_8step_lora:1> mountains"

Versions I tested with:
760cfaa-cuda, ce1bcc7-cuda and ce1bcc7-avx2: output
6d16f68-cuda: output

All of the above versions produce this without the LoRA: output

gartia avatar May 02 '24 14:05 gartia

On master-6d16f68 I am still getting a GGML error.

ring-c avatar May 02 '24 15:05 ring-c

That's really odd. I'm running on Windows with CUDA; are you on a different OS or hardware, maybe?

gartia avatar May 02 '24 16:05 gartia

I am running sd.cpp as a lib, on Linux/CUDA. I do not think this is important; the code is the same, as far as I can see.

The strange part is that you do not get an error on earlier versions. My fix does not do anything special; it just doesn't save to memory tensors that we can't use anyway. If the fix does something wrong, you get a broken image (black stuff). In some tests it helps to run LoRAs without problems. But before the fix you must get an error message from GGML.

ring-c avatar May 02 '24 17:05 ring-c

> On this build most of the lora's i have will cause a black image as an output. I'm not entirely sure why this would be the case, but the next build after that one spams the console with this for a large amount of tensors [ERROR] model.cpp:893 - skip tensor 'lora_unet_output_blocks_5_1_transformer_blocks_1_ff_net_2.alpha' with n_dims 0

I had the same issue and solved it by commenting out the changes in https://github.com/leejet/stable-diffusion.cpp/pull/233. I suspect this is due to the LoRA alpha scalars having zero dimensions.

Edit: #263

grauho avatar May 12 '24 17:05 grauho

I still can't reproduce the error. @grauho, can you provide your setup?

ring-c avatar May 13 '24 04:05 ring-c

> I still cant reproduce the error. @grauho can you provide you setup?

./bin/sd --model ../../.models/sd1-5_PonyV6.safetensors --lora-model-dir ../../.loras/ --prompt "<lora:marblesh:0.8> a lovely cat"

CPU back-end without any extra compilation arguments, using the most recent commit, ce1bcc74a6bf. It happens with every model and LoRA combination I've looked at, both SD1.5 and SDXL. My fix addresses the problem by not doing the dims check for alpha tensors, as those are expected to be scalars.
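The idea of exempting alpha tensors from the dims check can be sketched roughly as below. This is only an illustration of the approach described above, not the code from the actual patch; `ends_with` and `should_skip_tensor` are hypothetical helpers, and the `.alpha` suffix is taken from the tensor name in the error message at the top of the thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when `name` ends with `suffix`.
bool ends_with(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical filter: decide whether the loader should drop a tensor.
// Tensors with zero dimensions are dropped unless they are LoRA ".alpha"
// scalars, which are expected to be 0-dim and are still needed later.
bool should_skip_tensor(const std::string& name, int n_dims) {
    assert(n_dims >= 0);  // a negative count would indicate a parsing bug
    if (n_dims != 0) {
        return false;  // regular tensors are never skipped by this rule
    }
    return !ends_with(name, ".alpha");
}
```

Under this rule, `lora_unet_output_blocks_5_1_transformer_blocks_1_ff_net_2.alpha` with `n_dims` 0 would be kept, while any other 0-dim tensor would still be skipped.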

grauho avatar May 13 '24 10:05 grauho

Updates: So for most LoRAs skipping the "alpha" scalars doesn't seem to matter. That said, when using certain LoRAs, such as the SDXL 8-Step Lightning LoRA mentioned above in gartia's comment, it does make a difference. With the fix it works fine, without it garbled noise is produced instead. It's also worth noting that I don't get GGML errors either way.
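One plausible explanation for why skipping alpha only matters for some LoRAs: in the common LoRA formulation the low-rank update is multiplied by alpha / rank, so when alpha equals the rank the factor is 1 and dropping alpha changes nothing. The sketch below illustrates that arithmetic. `lora_scale` is a hypothetical helper, the fallback of 1.0 when alpha is missing is an assumption about the loader, and the numbers in the usage note are illustrative rather than read from any real LoRA file.

```cpp
#include <cassert>

// Hypothetical helper: effective scale applied to one LoRA layer's update.
// `rank` is the inner dimension of the low-rank pair; `alpha` is the scalar
// stored in the layer's ".alpha" tensor.
// Assumption: when the alpha tensor was skipped, the scale falls back to 1.0.
float lora_scale(int rank, bool has_alpha, float alpha) {
    assert(rank > 0);
    if (!has_alpha) {
        return 1.0f;  // assumed fallback when alpha is unavailable
    }
    return alpha / static_cast<float>(rank);
}
```

For example, with rank 64 and alpha 64 the scale is 1.0 whether or not alpha is loaded. With rank 64 and alpha 16 the intended scale is 0.25, but a loader that skipped alpha would apply 1.0, four times too strong, which could plausibly show up as garbled output.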

You can find the LoRA here if you want to try to replicate my results and don't want to make a civitai account: https://huggingface.co/ByteDance/SDXL-Lightning

Here's the invocation I used, changing only whether the fix was applied:

./bin/sd -v -m ../../.models/ponyDiffusionV6XL_v6StartWithThisOne.safetensors --vae ../../.models/pony_sdxl_vae.safetensors -s 1 -b 1 -W 1024 -H 1024 --lora-model-dir ../../.loras/ --cfg-scale 7 --steps 8 --clip-skip 2 -p "<lora:sdxl_lightning_8step_lora:1> mountains" --color --output foobar

grauho avatar May 13 '24 12:05 grauho

The issue arose due to my mistaken merge of https://github.com/leejet/stable-diffusion.cpp/pull/233. I believe it has been fixed in the latest commit.

leejet avatar May 14 '24 15:05 leejet

@grauho In Release mode, assertions are disabled during compilation, which is likely why you couldn't reproduce the error. I believe most people compile in Release mode, so they wouldn't encounter the issue.

assert(n_dims >= 1 && n_dims <= GGML_MAX_DIMS);
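To illustrate the point about Release builds: the standard `assert` macro expands to nothing when `NDEBUG` is defined, which Release configurations typically do. The sketch below is a small stand-alone demonstration, not GGML code; `kMaxDims` stands in for `GGML_MAX_DIMS` (assumed to be 4), and `assertions_enabled`, `dims_in_range` and `new_tensor_checked` are hypothetical names.

```cpp
#include <cassert>

constexpr int kMaxDims = 4;  // stand-in for GGML_MAX_DIMS (assumed value)

// Reports whether assert() is active in this build. The assignment inside
// the assert only runs when NDEBUG is not defined.
bool assertions_enabled() {
    bool enabled = false;
    assert((enabled = true));
    return enabled;
}

// The condition from the assertion above, as a plain function.
bool dims_in_range(int n_dims) {
    return n_dims >= 1 && n_dims <= kMaxDims;
}

// Mirrors the shape of the check: aborts on a 0-dim tensor in a Debug
// build, but silently continues in a build compiled with NDEBUG.
void new_tensor_checked(int n_dims) {
    assert(dims_in_range(n_dims));
    (void)n_dims;  // avoids an unused-parameter warning when asserts vanish
}
```

In a Debug build `new_tensor_checked(0)` aborts with an assertion failure like the one ring-c posted. In a build with `NDEBUG` defined the same call returns normally, so the invalid tensor goes unnoticed until it causes problems later.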

leejet avatar May 14 '24 15:05 leejet

> The issue arose due to my mistaken merge of #233. I believe it has been fixed in the latest commit.

Doesn't seem to be fixed, at least for me. I'm currently testing on an M2 Max with Metal enabled, built in Release mode from the latest commit on the master branch from May 14th, trying to use the LoRA example in the readme:

./sd -v -m "./v1-5-pruned-emaonly.f16.gguf" -W 512 -H 512 --lora-model-dir "./" --cfg-scale 1 --steps 5 -p "cat sitting in a box<lora:lcm-lora-sdv1-5:1>"

This just generates an all black image. Without the LoRA it works as expected.

EDIT: I did some additional testing, and on Windows using a CPU-only release build (no CUDA) I get the same all-black output. On Windows using a CUDA release build, it crashes when it tries to load the LoRA weights. So it's not just a Mac thing, at least. I'm happy to do more testing, including on Linux. Please just let me know.

einsteinx2 avatar May 25 '24 16:05 einsteinx2