stable-diffusion.cpp
Metal text2img crashes
On my Mac Studio M2 Ultra, the Metal build always crashes when I run it directly. If I run it under lldb there is a chance that I get an output, but it can still crash. I'm following this guide and using the default cat prompt with leejet's q4_k flux schnell model; his q2_k model behaves the same. The guide's link to the VAE safetensors file is inaccessible to me because I'm not part of flux-dev, so I used the official black-forest-labs VAE instead.
...
[INFO ] stable-diffusion.cpp:1236 - get_learned_condition completed, taking 3080 ms
[INFO ] stable-diffusion.cpp:1259 - sampling using Euler method
[INFO ] stable-diffusion.cpp:1263 - generating image: 1/1 - seed 42
[DEBUG] ggml_extend.hpp:980 - flux compute buffer size: 398.50 MB(VRAM)
zsh: segmentation fault ./build-metal/bin/sd --vae --clip_l --t5xxl -p --cfg-scale 1.0 euler -v
...
[INFO ] stable-diffusion.cpp:1236 - get_learned_condition completed, taking 3084 ms
[INFO ] stable-diffusion.cpp:1259 - sampling using Euler method
[INFO ] stable-diffusion.cpp:1263 - generating image: 1/1 - seed 42
[DEBUG] ggml_extend.hpp:980 - flux compute buffer size: 398.50 MB(VRAM)
Process 41298 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=257, address=0x9b037b0376037003)
frame #0: 0x00007b0376037003
error: memory read failed for 0x7b0376037000
Target 0: (sd) stopped.
...
(lldb) fr v
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=257, address=0x9b037b0376037003)
* frame #0: 0x00007b0376037003
frame #1: 0x000000010015ae5c sd`ggml_metal_graph_compute(ctx=0x000000014080ea00, gf=<unavailable>) at ggml-metal.m:2870:36 [opt]
frame #2: 0x0000000100137ec4 sd`ggml_backend_graph_compute [inlined] ggml_backend_graph_compute_async(backend=0x0000600000d4c370, cgraph=<unavailable>) at ggml-backend.c:282:12 [opt]
frame #3: 0x0000000100137ebc sd`ggml_backend_graph_compute(backend=0x0000600000d4c370, cgraph=<unavailable>) at ggml-backend.c:276:28 [opt]
frame #4: 0x0000000100081078 sd`GGMLRunner::compute(this=0x000000013ff07960, get_graph=<unavailable>, n_threads=16, free_compute_buffer_immediately=false, output=0x000000016fdfd568, output_ctx=0x0000000000000000) at ggml_extend.hpp:1095:9 [opt]
...
Hope the stack trace helps in fixing this issue! Thanks for making SD run locally!
EDIT: More information:
- I'm on commit 8847114abfd900898e78d0257f5f9086f2473601
Date: Sun Aug 25 22:39:39 2024 +0800
fix: fix issue when applying lora
- I built stable-diffusion.cpp with:
cmake -G Ninja -DSD_METAL=ON -DCMAKE_BUILD_TYPE="RelWithDebInfo" .. && cmake --build .
(The default Release and Debug builds also reproduce the crash.)
- I ran with the guide's sample command:
./bin/sd --vae ~/work/models/stable-diffusion/diffusion_pytorch_model.safetensors --clip_l ~/work/models/stable-diffusion/clip_l.safetensors --t5xxl ~/work/models/stable-diffusion/t5xxl_fp16.safetensors -p "a lovely cat holding a sign says 'flux.cpp'" --cfg-scale 1.0 --sampling-method euler -v --diffusion-model ~/work/models/stable-diffusion/flux1-schnell-q4_k.gguf
- My machine: Mac Studio M2 Ultra with 24 CPU cores and 64 GB unified RAM on macOS Sonoma 14.6.1 (23G93)
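Since the crash is intermittent under lldb, it may help to measure how often a given build fails. Below is a rough sketch of a helper for that. Note that `count_failures` is a hypothetical shell function written for this report, not part of stable-diffusion.cpp. It runs any command a given number of times and prints how many runs ended with a non-zero exit status, which includes being killed by a segmentation fault.

```shell
#!/bin/sh
# Hypothetical helper (not part of stable-diffusion.cpp):
# run a command N times and print how many runs exited abnormally.
# Usage: count_failures N command [args...]
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only its exit status matters here.
        # A process killed by a signal (e.g. SIGSEGV) yields a non-zero status.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

To use it, pass the run count followed by the same `./bin/sd` command line as above, for example `count_failures 10 ./bin/sd` plus the remaining arguments. Comparing the count between the Metal build and a CPU-only build would show whether the crash is specific to the Metal backend.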