Improved memory management.
These changes make memory management less fragile (much lower chance of custom nodes, extensions, or future code changes breaking it) and should remove the noticeable delay when switching workflows with large models.
I'm making a PR with these changes so people can test them and make sure there are no obvious bugs before I merge.
I've been testing this and it seems to work mostly fine.
However, my prompt control nodes seem to have some problems with LoRA switching, since patch_model() in ModelPatcher no longer appears to modify the model weights. I fixed that by explicitly calling load_models_gpu() after swapping LoRAs, but it's kind of slow.
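For reference, the workaround looks roughly like this (a simplified sketch; swap_lora is a hypothetical helper, not my actual node code):

```python
import comfy.model_management

# Hypothetical helper illustrating the workaround: after swapping LoRA
# patches on a ModelPatcher, explicitly reload the model so the new
# weights are actually applied on the GPU. The reload is the slow part.
def swap_lora(model_patcher, lora_patches, strength=1.0):
    model_patcher.unpatch_model()                      # drop the old LoRA weights
    model_patcher.add_patches(lora_patches, strength)  # queue the new patches
    comfy.model_management.load_models_gpu([model_patcher])  # force-apply them
```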
Should LoadedModel.model_load pass force_patch_weights to model_use_more_vram? It currently appears to just ignore the parameter.
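Concretely, I'd expect something along these lines (a sketch built only from the names above; the real signatures in comfy/model_management.py may differ):

```python
# Sketch only: forward the flag instead of silently dropping it.
# model_use_more_vram and the parameter names here are assumptions
# taken from the discussion, not verified against the source.
class LoadedModel:
    def model_load(self, lowvram_model_memory=0, force_patch_weights=False):
        ...
        self.model_use_more_vram(lowvram_model_memory,
                                 force_patch_weights=force_patch_weights)
```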
I also have another problem where some model reference becomes None when I switch models, but I haven't figured out why, or whether that's also a bug in one of the nodes I use. I'll try to see if I can reproduce the problem with a simpler workflow.
These are likely just bugs in my custom nodes, but I thought I'd let you know anyway.
> Should LoadedModel.model_load pass force_patch_weights to model_use_more_vram? It currently appears to just ignore the parameter.
Good catch.
> However, my prompt control nodes seem to have some problems with LoRA switching, since patch_model() in ModelPatcher no longer appears to modify the model weights. I fixed that by explicitly calling load_models_gpu() after swapping LoRAs, but it's kind of slow.
The code of the patch_model function hasn't changed; how are you using it?
> The code of the patch_model function hasn't changed; how are you using it?
I install a monkey patch that hijacks the callback during sampling to add and remove LoRA patches, then calls patch_model to update the weights in memory before the next step. For whatever reason, that stopped working with these changes until I forced the model to be loaded onto the GPU; I'm not sure what exactly is going wrong.
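In rough outline, the hack looks like this (hypothetical names, heavily simplified; the real code does a lot more bookkeeping):

```python
# Simplified sketch of the callback hijack; `schedule` maps a step index
# to the LoRA patches that should become active at that step.
def make_lora_switching_callback(model_patcher, schedule, inner_callback=None):
    def callback(step, x0, x, total_steps):
        if step in schedule:
            patches, strength = schedule[step]
            model_patcher.unpatch_model()              # remove the previous LoRA
            model_patcher.add_patches(patches, strength)
            model_patcher.patch_model()                # re-apply weights in memory
        if inner_callback is not None:
            inner_callback(step, x0, x, total_steps)
    return callback
```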
The whole thing is honestly a huge pile of hacks so it's entirely possible it worked merely by accident before and this change is just exposing some bugs.
This will be merged in https://github.com/comfyanonymous/ComfyUI/pull/5583, so please go test that one.