ComfyUI VRAM issue: using an A100, but models are still unloaded and reloaded for each new inference
My workflow: workflow_api_comfyui_vram.json
I am using an A100 with 80 GB of VRAM, which should be sufficient.
My workflow includes an IP adapter. For each new inference, the "Apply IPAdapter" node loads and unloads some models.
ComfyUI startup info:
I turned on the --gpu-only option to force high-VRAM mode.
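For reference, the launch command looks roughly like this (just a sketch of my setup; the path and port are examples, --gpu-only is the flag that matters here):

python main.py --port 8188 --gpu-only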
Is there any way to stop the model unloading and reloading to make inference faster? I should have sufficient VRAM.
@comfyanonymous Could you please share a potential solution to this? Thank you!
Try --gpu-only if you're not already. Sometimes that message shows up even when nothing is actually swapped off of or onto the GPU. Based on your first vs. second run times (assuming those were in fact the first and second), the overhead of loading the model was all in the 7.85 s run; the 6.2 s run that followed didn't do anything different. Then something changed with your loaded image, so it needed to change the model again and took an extra second. I'm not familiar with exactly how IPAdapters work; if they create weights from images, they may need to load a fresh set of model weights to work on whenever you change the input image.
I'm having the exact same issue: even with --gpu-only, in any IPAdapter workflow the CLIPVision and SDXL CLIP models load again.
Requested to load CLIPVisionModelProjection
Loading 1 new model
Requested to load SDXLClipModel
Loading 1 new model
Same issue on 4090. IPAdapter part of the workflow:
--gpu-only doesn't change this behavior.
Any solution for this? Same issue running on an A100.
I was struggling with this problem on an RTX 4080 for two weeks, and with these changes in the .bat file the problem was solved!
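REM --cuda-device 0 pins ComfyUI to the first GPU; --gpu-only stores and runs everything
REM (text encoders, CLIPVision, VAE, and the diffusion model) in VRAM instead of offloading between runs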
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --port 8189 --cuda-device 0 --gpu-only
pause