stabilityai/stable-diffusion-2-1: Errors generating images with Swift Package
My machine is a MacBook Air M1 with 16 GB of RAM.
When using Stable Diffusion 2.1 from stabilityai, I get the following error spammed to the console, seemingly on every step (not confirmed):
2022-12-22 17:18:26.961576-0500 Diffusor[13565:536479] [espresso] ANE Batch: 2 of the async requests being waited for returned errors. Only the first of these will be surfaced.
2022-12-22 17:18:26.961636-0500 Diffusor[13565:536479] [espresso] ANE Batch: Async request 10 returned error: code=5 err=Error Domain=com.apple.appleneuralengine Code=5 "processRequest:model:qos:qIndex:modelStringID:options:error:: 0xd: Program Inference overflow" UserInfo={NSLocalizedDescription=processRequest:model:qos:qIndex:modelStringID:options:error:: 0xd: Program Inference overflow}
2022-12-22 17:18:26.961655-0500 Diffusor[13565:536479] [espresso] [Espresso::overflow_error] :9
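For context, "Program Inference overflow" presumably refers to a Float16 overflow on the Neural Engine (that is my reading of the log, not an Apple-documented meaning): the ANE executes these models in Float16, whose largest finite value is 65504, and SD 2.x activations reportedly exceed that range. A minimal numpy illustration of that kind of saturation:

```python
import numpy as np

# Float16 can only represent finite values up to 65504; anything larger
# saturates to infinity. The ANE runs these models in Float16, which is
# the likely source of the "Program Inference overflow" messages.
assert np.finfo(np.float16).max == 65504.0

a = np.float16(60000)
b = np.float16(60000)
result = a + b  # the true sum, 120000, exceeds the Float16 range
assert np.isinf(result)
```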
After a while, it logs the following error and stack trace to the console:
2022-12-22 17:19:42.715811-0500 Diffusor[13565:536721] [common] processRequest:model:qos:qIndex:modelStringID:options:error:: ANEProgramProcessRequestDirect() Failed with status=0xf : statusType=0x11 lModel=_ANEModel: { modelURL=file:///Users/turnereison/Developer/stable-diffusion/ml-stable-diffusion/models/Resources/VAEDecoder.mlmodelc/model.mil : key={"isegment":0,"inputs":{"z":{"shape":[96,96,1,4,1]}},"outputs":{"image":{"shape":[768,768,1,3,1]}}} : string_id=0x00000000 : program=_ANEProgramForEvaluation: { programHandle=443484529369 : intermediateBufferHandle=443486389195 : queueDepth=32 } : state=3 : programHandle=443484529369 : intermediateBufferHandle=443486389195 : queueDepth=32 : attr={
ANEFModelDescription = {
ANEFModelProcedures = (
{
ANEFModelInputSymbolIndexArray = (
0
);
ANEFModelOutputSymbolIndexArray = (
0
);
ANEFModelProcedureID = 0;
}
);
kANEFModelInputSymbolsArrayKey = (
z
);
kANEFModelOutputSymbolsArrayKey = (
"image@output"
);
kANEFModelProcedureNameToIDMapKey = {
net = 0;
};
};
NetworkStatusList = (
{
LiveInputList = (
{
BatchStride = 73728;
Batches = 1;
Channels = 4;
Depth = 1;
DepthStride = 73728;
Height = 96;
Interleave = 1;
Name = z;
PlaneCount = 4;
PlaneStride = 18432;
RowStride = 192;
Symbol = z;
Type = Float16;
Width = 96;
}
);
LiveOutputList = (
{
BatchStride = 3538944;
Batches = 1;
Channels = 3;
Depth = 1;
DepthStride = 3538944;
Height = 768;
Interleave = 1;
Name = "image@output";
PlaneCount = 3;
PlaneStride = 1179648;
RowStride = 1536;
Symbol = "image@output";
Type = Float16;
Width = 768;
}
);
Name = net;
}
);
} : perfStatsMask=0}
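Incidentally, the shapes and strides in that dump are exactly the planar Float16 buffer layout for a 96×96×4 latent in and a 768×768×3 image out, i.e. this is the 768-pixel VAE decoder failing. A quick arithmetic sanity check (plain Python, nothing model-specific):

```python
# Sanity-check the buffer strides reported in the ANE error dump.
# All tensors are Float16, so each element is 2 bytes.
BYTES_PER_ELEMENT = 2  # Float16

def strides(width, height, channels):
    """Row/plane/batch strides in bytes for a planar H x W x C Float16 buffer."""
    row = width * BYTES_PER_ELEMENT
    plane = height * row
    batch = channels * plane
    return row, plane, batch

# Input latent z: 96x96, 4 channels -> matches RowStride=192,
# PlaneStride=18432, BatchStride=73728 from the log.
assert strides(96, 96, 4) == (192, 18432, 73728)

# Output image: 768x768, 3 channels -> matches RowStride=1536,
# PlaneStride=1179648, BatchStride=3538944 from the log.
assert strides(768, 768, 3) == (1536, 1179648, 3538944)
```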
2022-12-22 17:19:43.720777-0500 Diffusor[13565:536479] -[NSNull featureNames]: unrecognized selector sent to instance 0x20c2556e8
2022-12-22 17:19:43.726682-0500 Diffusor[13565:536479] [General] -[NSNull featureNames]: unrecognized selector sent to instance 0x20c2556e8
2022-12-22 17:19:43.733922-0500 Diffusor[13565:536479] [General] (
0 CoreFoundation 0x00000001b019f3f8 __exceptionPreprocess + 176
1 libobjc.A.dylib 0x00000001afceaea8 objc_exception_throw + 60
2 CoreFoundation 0x00000001b0241c1c -[NSObject(NSObject) __retain_OA] + 0
3 CoreFoundation 0x00000001b0105670 ___forwarding___ + 1600
4 CoreFoundation 0x00000001b0104f70 _CF_forwarding_prep_0 + 96
5 Diffusor 0x00000001009e5968 $s15StableDiffusion7DecoderV6decodeySaySo10CGImageRefaGSay6CoreML13MLShapedArrayVySfGGKFAFSiXEfU1_ + 200
6 Diffusor 0x00000001009e70d0 $s15StableDiffusion7DecoderV6decodeySaySo10CGImageRefaGSay6CoreML13MLShapedArrayVySfGGKFAFSiXEfU1_TA + 28
7 libswiftCore.dylib 0x00000001bdde0038 $sSlsE3mapySayqd__Gqd__7ElementQzKXEKlF + 656
8 Diffusor 0x00000001009e4f24 $s15StableDiffusion7DecoderV6decodeySaySo10CGImageRefaGSay6CoreML13MLShapedArrayVySfGGKF + 792
9 Diffusor 0x00000001009ff6dc $s15StableDiffusion0aB8PipelineV14decodeToImages_13disableSafetySaySo10CGImageRefaSgGSay6CoreML13MLShapedArrayVySfGG_SbtKF + 104
10 Diffusor 0x00000001009fe540 $s15StableDiffusion0aB8PipelineV14generateImages6prompt14negativePrompt10imageCount04stepJ04seed13disableSafety9scheduler15progressHandlerSaySo10CGImageRefaSgGSS_SSS2is6UInt32VSbAA0aB9SchedulerOSbAC8ProgressVXEtKF + 3480
11 Diffusor 0x00000001009de878 $s8Diffusor11ContentViewV4bodyQrvgyycfU0_yyYaYbcfU_TY0_ + 1112
12 Diffusor 0x00000001009dfe8d $s8Diffusor11ContentViewV4bodyQrvgyycfU0_yyYaYbcfU_TATQ0_ + 1
13 Diffusor 0x00000001009e00f9 $sxIeghHr_xs5Error_pIegHrzo_s8SendableRzs5NeverORs_r0_lTRTQ0_ + 1
14 Diffusor 0x00000001009e0249 $sxIeghHr_xs5Error_pIegHrzo_s8SendableRzs5NeverORs_r0_lTRTATQ0_ + 1
15 libswift_Concurrency.dylib 0x00000002397648b5 _ZL23completeTaskWithClosurePN5swift12AsyncContextEPNS_10SwiftErrorE + 1
)
This happens with both the chunked and unchunked models, and with reduced-memory mode both on and off.
I can use the converted Stable Diffusion v2.1 models in Swift.
- MBA/M1/8GB memory, macOS 13.2, Xcode 14.2
- Xcode project: https://github.com/ynagatomo/ImgGenSD2
I converted the models with this command: % python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker -o sd2CoremlChunked --model-version stabilityai/stable-diffusion-2-1-base --bundle-resources-for-swift-cli --chunk-unet --attention-implementation SPLIT_EINSUM --compute-unit CPU_AND_NE
I had a similar issue using stabilityai/stable-diffusion-2-1 and then realized you need to use the base version. Using stabilityai/stable-diffusion-2-1-base, as @ynagatomo suggests, works fine on my MacBook Pro M1 Max.
What's the difference between the "base" version and normal Stable Diffusion v2.1?
- base: 512×512 image generation
- normal: 768×768 image generation (needs more working memory)
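As a back-of-the-envelope sketch (assuming the standard SD convention that latents are 1/8 of the image resolution with 4 channels, stored as Float16), the decoder's input/output buffers alone grow by (768/512)² = 2.25× between the two variants:

```python
# Rough comparison of the VAE decoder I/O buffer sizes for the two model
# variants (Float16 = 2 bytes per element). This counts only the live
# input/output tensors, not intermediate activations, which scale similarly.
def decoder_io_bytes(image_size):
    latent = image_size // 8                  # SD latents are 1/8 resolution
    z = latent * latent * 4 * 2               # 4-channel latent input
    image = image_size * image_size * 3 * 2   # RGB image output
    return z + image

base = decoder_io_bytes(512)    # stable-diffusion-2-1-base
normal = decoder_io_bytes(768)  # stable-diffusion-2-1

assert normal / base == 2.25    # 768^2 / 512^2 -> 2.25x more working memory
```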
I'm seeing this too with SD 2.1. Following @ynagatomo's suggestion above, I tried converting the model with the --chunk-unet parameter, and it does indeed work and produce an image, albeit emitting this line at every step on the console: [espresso] [Espresso::overflow_error] /Users/ptsochantaris/[redacted]/UnetChunk2.mlmodelc/model.mil
Edit: It seems that the best way to run SD 2.0 and SD 2.1 is to explicitly use the CPU_AND_GPU compute-unit option, with or without chunking; all the errors then go away. Otherwise there's something in these models that the Neural Engine doesn't like.
Same for me, with the coreml-stable-diffusion-2-1-base split_einsum model running on the Neural Engine:
[espresso] [Espresso::overflow_error] /.../StableDiffusionV2.1/UnetChunk2.mlmodelc/model.mil