Add mps-cpu support
Summary
- Adapted stablediffusion to work on M1 Macs (MPS).
- Created an automatic device detection system for #147 (a sketch is shown after the feature list below).
Feature
- Support MPS for txt2img.py, img2img.py, the Gradio demo, and the Streamlit demo.
- Add a Mac config (v2-inference-v-mac.yaml).
Of course, it can also run on CUDA.
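For reference, here is a minimal sketch of what such automatic device detection might look like. The helper name `get_device` and the preference order (CUDA, then MPS, then CPU) are assumptions for illustration, not necessarily how this PR structures it:

```python
import torch


def get_device() -> torch.device:
    """Hypothetical helper: pick the best available backend.

    Preference order (an assumption): CUDA, then MPS, then CPU.
    """
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps.is_available() requires PyTorch >= 1.12
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = get_device()
# e.g. allocate latents on the selected device
x = torch.randn(1, 4, 64, 64, device=device)
```

Falling back to CPU keeps the scripts usable on machines with neither accelerator, which is what the CPU rows in the benchmark below exercise.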
Hi, nice work. Can you share perf on M1 CPU and GPU, like https://github.com/Stability-AI/stablediffusion/pull/147#issuecomment-1402153477 for the recent Intel IPEX pull request?
These are the results from my environment for txt2img.py:
| CPU | Memory |
|---|---|
| Apple M1 | 16GB |
| Batch size | Model | Device | Arguments | Avg. gen time | Speedup vs. CPU |
|---|---|---|---|---|---|
| 1 | SD2-v (768px) | CPU | --precision full | 273s | N/A |
| 1 | SD2-v (768px) | MPS | | 100s | 273% |
| 4 | SD2-v (768px) | CPU | --precision full | 1010s | N/A |
| 4 | SD2-v (768px) | MPS | | 577s | 175% |
| 1 | SD2-base (512px) | CPU | --precision full | 271s | N/A |
| 1 | SD2-base (512px) | MPS | | 98s | 276% |
| 4 | SD2-base (512px) | CPU | --precision full | 1014s | N/A |
| 4 | SD2-base (512px) | MPS | | 600s | 169% |
Thanks for testing! Hope it can be merged soon!
If there is a problem, I'll revert 90d4c71.
Can confirm this works more or less on M2 with 16GB (although I'm not sure how you generated 768x768; that crashes for me when allocating a buffer).