
Stable Diffusion Quantization

Open pushkarjain1009 opened this issue 1 year ago • 1 comment

Hey, can you please elaborate on the quantization method you used here for SD-1.4? I am trying to implement a similar project but am stuck with the quantization process. I presume you used INT8 quantization for deployment on a mobile device. How did you achieve that in both the TFLite format and the ONNX format? Can you please help me with that?

pushkarjain1009 avatar May 29 '24 09:05 pushkarjain1009

I applied dynamic quantization to both TFLite models: the diffusion model and the text_encoder. However, I ran into difficulties with the diffusion model because of its large size and couldn't find a suitable way to quantize it with the ONNX library at the time. Furthermore, the inference time of the text_encoder did not improve significantly with the INT8 ONNX model, so I kept the TFLite version for simplicity. Attached is the notebook I used for converting and quantizing these models in this project.

integerquant.zip
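For anyone who can't open the notebook: a minimal sketch of the dynamic-range quantization recipe with the TFLite converter. The tiny Keras model below is a placeholder standing in for the project's much larger text_encoder and diffusion models, not the actual architecture.

```python
import tensorflow as tf

# Placeholder model; the same recipe applies to any Keras/SavedModel graph.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])

# Dynamic-range quantization: weights are stored as INT8, while activations
# stay float and are quantized on the fly at inference time. No calibration
# dataset is required, which is why it suits large models like the UNet.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant = converter.convert()  # bytes of the quantized .tflite flatbuffer

with open("model_dynamic_int8.tflite", "wb") as f:
    f.write(tflite_quant)
```

On the ONNX side, the rough equivalent is `onnxruntime.quantization.quantize_dynamic`, which rewrites an existing `.onnx` file with INT8 weights; as noted above, it didn't yield a meaningful speedup for the text_encoder in this project.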

Anthrapper avatar May 29 '24 10:05 Anthrapper