Unable to generate SD3 Medium images with a 24GB GPU
**LocalAI version:** localai/localai:v2.17.1-cublas-cuda12

**Environment, CPU architecture, OS, and Version:** Linux sphinx 6.5.0-28-generic #29~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr 4 14:39:20 UTC 2 x86_64 x86_64 x86_64 GNU/Linux, RTX 4090
**Describe the bug**
Generating even a 512x512 or 256x256 image runs the 24GB GPU out of memory.
**To Reproduce**

```sh
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a test image",
    "model": "stable-diffusion-3-medium",
    "size": "256x256"
  }'
```
**Expected behavior**
An image is generated.
**Logs**

```
10:29PM ERR Server error error="could not load model (no success): Unexpected err=RuntimeError('CUDA error: out of memory\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\nCompile with TORCH_USE_CUDA_DSA to enable device-side assertions.\n'), type(err)=<class 'RuntimeError'>"
```
Me too. I'm using a 32GB V100; does it really need that much memory?
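For a rough sense of why this can OOM: a back-of-the-envelope estimate of the VRAM needed just to hold SD3 Medium's weights. The parameter counts below are approximate public figures (my assumption, not taken from this thread), but they suggest fp32 loading alone would overflow a 24GB card, while fp16 fits:

```python
# Approximate parameter counts for SD3 Medium (assumption):
#   MMDiT diffusion transformer ~2.0B, CLIP-L ~0.12B, CLIP-G ~0.7B, T5-XXL ~4.8B
params_billion = 2.0 + 0.12 + 0.7 + 4.8  # ~7.6B total

def weights_gb(params_b: float, bytes_per_param: int) -> float:
    """GiB needed just to hold the weights at the given precision."""
    return params_b * 1e9 * bytes_per_param / 1024**3

fp32 = weights_gb(params_billion, 4)  # over 24 GiB before any activations
fp16 = weights_gb(params_billion, 2)  # fits on a 24 GiB card
print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB")
```

This ignores activations, the VAE, and CUDA context overhead, so the real requirement is somewhat higher in both cases.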
I have the same problem on a 24GB Tesla P40.
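A possible workaround is forcing half precision in the model config. The sketch below mirrors LocalAI's documented diffusers-backend YAML; the file contents, model repo, and option names here are assumptions to verify against your LocalAI version, not a confirmed fix from this thread:

```yaml
# Hypothetical stable-diffusion-3-medium.yaml for LocalAI's diffusers backend.
name: stable-diffusion-3-medium
backend: diffusers
f16: true                         # load weights in half precision
parameters:
  model: stabilityai/stable-diffusion-3-medium-diffusers
diffusers:
  cuda: true
  pipeline_type: StableDiffusion3Pipeline
```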
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.