
Pix2Pix assertion failed

Open kacperd opened this issue 1 year ago • 1 comments

Hello!

I'm trying to use the model https://huggingface.co/timbrooks/instruct-pix2pix. I successfully converted it to a Core ML model. However, when I try to run the pipeline, it crashes with an error:

StableDiffusion/Scheduler.swift:278: Assertion failed

It's specifically the assertion in this part of the code:

func convertModelOutput(modelOutput: MLShapedArray<Float32>, timestep: Int, sample: MLShapedArray<Float32>) -> MLShapedArray<Float32> {
    assert(modelOutput.scalarCount == sample.scalarCount)
    let scalarCount = modelOutput.scalarCount
    let (alpha_t, sigma_t) = (self.alpha_t[timestep], self.sigma_t[timestep])
    ....
}

Can this somehow be avoided by adjusting StableDiffusionPipeline.Configuration?

kacperd, Aug 22 '24 09:08

@kacperd The model you linked uses different input preparation logic for the UNet: it expects 8 input channels instead of 4.

From Section 3.2 of the paper:

"We therefore initialize the weights of our model with a pretrained Stable Diffusion checkpoint, leveraging its vast text-to-image generation capabilities. To support image conditioning, we add additional input channels to the first convolutional layer, concatenating z_t and E(c_I)."

If you implement this, please contribute :)
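
For anyone picking this up, the channel concatenation described in the quoted passage can be sketched roughly as below. This is only an illustration under stated assumptions, not code from ml-stable-diffusion: `Tensor4D` and `concatenateChannels` are hypothetical names, the tensors are plain flat arrays in NCHW layout rather than `MLShapedArray`, and the tiny 2x2 latents are made up for the example.

```swift
// Hypothetical helper types, not part of ml-stable-diffusion.
// A tensor in NCHW layout stored as a flat, row-major array.
struct Tensor4D {
    var shape: [Int]      // [batch, channels, height, width]
    var scalars: [Float]
}

// Concatenates two NCHW tensors along the channel axis (axis 1).
func concatenateChannels(_ a: Tensor4D, _ b: Tensor4D) -> Tensor4D {
    precondition(a.shape.count == 4 && b.shape.count == 4, "expected NCHW tensors")
    precondition(a.shape[0] == b.shape[0] && a.shape[2] == b.shape[2] && a.shape[3] == b.shape[3],
                 "batch, height and width must match")
    let (n, h, w) = (a.shape[0], a.shape[2], a.shape[3])
    let (ca, cb) = (a.shape[1], b.shape[1])
    let plane = h * w

    var out = [Float]()
    out.reserveCapacity(n * (ca + cb) * plane)
    for batch in 0..<n {
        // For each batch item, all channels of `a` come first, then all channels of `b`.
        out += a.scalars[(batch * ca * plane)..<((batch + 1) * ca * plane)]
        out += b.scalars[(batch * cb * plane)..<((batch + 1) * cb * plane)]
    }
    return Tensor4D(shape: [n, ca + cb, h, w], scalars: out)
}

// Made-up stand-ins for the noisy latent z_t and the encoded input image E(c_I).
let noisyLatent = Tensor4D(shape: [1, 4, 2, 2], scalars: [Float](repeating: 0.5, count: 16))
let imageLatent = Tensor4D(shape: [1, 4, 2, 2], scalars: [Float](repeating: 1.0, count: 16))

let unetInput = concatenateChannels(noisyLatent, imageLatent)
print(unetInput.shape) // [1, 8, 2, 2]
```

The UNet of this model still predicts 4 output channels, so one plausible arrangement (an assumption, not something confirmed in this thread) is to feed the 8-channel tensor to the UNet and pass only the 4-channel z_t to the scheduler as `sample`, which would keep the two scalar counts in the assertion equal.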

atiorh, Aug 29 '24 19:08