
Improved memory management.

Open comfyanonymous opened this issue 1 year ago • 4 comments

These changes make the memory management less fragile (much less chance of custom nodes, extensions, or future code changes breaking it) and should remove the noticeable delay when changing workflows with large models.

I'm making a PR with these changes so people can test them and make sure there are no obvious bugs before I merge it.

comfyanonymous avatar Nov 01 '24 06:11 comfyanonymous

I've been testing this and it seems to work mostly fine.

However, my prompt control nodes seem to have some problems with LoRA switching since patch_model() in ModelPatcher doesn't appear to modify the model weights anymore. I fixed that by explicitly executing load_models_gpu() after doing LoRA swapping, but it's kind of slow.

Should LoadedModel.model_load pass force_patch_weights to model_use_more_vram? It's currently just ignoring the parameter apparently.
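A minimal sketch of the concern, using a toy stand-in class rather than ComfyUI's actual code. The class name `ToyLoadedModel`, the `extra_memory` argument, and both method signatures are assumptions made for illustration; only the names `model_use_more_vram` and `force_patch_weights` come from the discussion.

```python
class ToyLoadedModel:
    """Hypothetical stand-in for illustration; not ComfyUI's LoadedModel."""

    def __init__(self):
        # Records (extra_memory, force_patch_weights) for every call,
        # so the effect of dropping the flag is observable.
        self.calls = []

    def model_use_more_vram(self, extra_memory, force_patch_weights=False):
        self.calls.append((extra_memory, force_patch_weights))
        return extra_memory

    def model_load_dropping_flag(self, extra_memory=0, force_patch_weights=False):
        # The flag is accepted but never forwarded, so the callee
        # always sees its own default (False).
        return self.model_use_more_vram(extra_memory)

    def model_load_forwarding_flag(self, extra_memory=0, force_patch_weights=False):
        # The flag is passed through explicitly by keyword.
        return self.model_use_more_vram(
            extra_memory, force_patch_weights=force_patch_weights
        )


loaded = ToyLoadedModel()
loaded.model_load_dropping_flag(extra_memory=1024, force_patch_weights=True)
loaded.model_load_forwarding_flag(extra_memory=1024, force_patch_weights=True)
print(loaded.calls)  # [(1024, False), (1024, True)]
```

In the first call the caller asked for forced patching, but the inner method never saw the request.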

I also have another problem where some model reference becomes None if I switch models, but I haven't figured out why, or whether that's also a bug in some of the nodes I use. I'll try to see if I can actually reproduce that problem with a simpler workflow.

It's likely these are just bugs in my custom nodes, but I thought I'd let you know anyway.

asagi4 avatar Nov 03 '24 13:11 asagi4

> Should LoadedModel.model_load pass force_patch_weights to model_use_more_vram? It's currently just ignoring the parameter apparently.

Good catch.

> However, my prompt control nodes seem to have some problems with LoRA switching since patch_model() in ModelPatcher doesn't appear to modify the model weights anymore. I fixed that by explicitly executing load_models_gpu() after doing LoRA swapping, but it's kind of slow.

The code of the patch_model function hasn't changed, how are you using it?

comfyanonymous avatar Nov 04 '24 10:11 comfyanonymous

> The code of the patch_model function hasn't changed, how are you using it?

I install a monkey patch that hijacks the callback during sampling to add and remove LoRA patches, then calls patch_model to update the weights in memory before the next step. For whatever reason, that stopped working with the changes in this PR until I forced the model to be loaded onto the GPU. I'm not sure what exactly is going wrong with it.
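The pattern described above can be sketched roughly as follows. This is a toy model with scalar "weights", not the actual prompt control code or ComfyUI's ModelPatcher: `ToyModel`, `toy_sample`, `hijack_callback`, and the `schedule` dictionary are all hypothetical names, and the real hook is a monkey patch rather than a plain wrapper.

```python
class ToyModel:
    """Hypothetical stand-in: one scalar base weight plus additive patches."""

    def __init__(self, base_weight):
        self.base_weight = base_weight
        self.patches = {}          # patch name -> additive delta
        self.weight = base_weight  # the weight the sampler actually reads

    def patch_model(self):
        # Recompute the effective weight from the base and the current patches.
        self.weight = self.base_weight + sum(self.patches.values())


def toy_sample(model, steps, callback=None):
    """Toy sampling loop: reads the weight each step, then fires the callback."""
    outputs = []
    for step in range(steps):
        outputs.append(model.weight)
        if callback is not None:
            callback(step)
    return outputs


def hijack_callback(model, schedule, original_callback=None):
    """Wrap a per-step callback so patches are swapped before the next step."""

    def wrapped(step):
        if original_callback is not None:
            original_callback(step)
        next_step = step + 1
        if next_step in schedule:
            model.patches = dict(schedule[next_step])
            # Without this call the sampler keeps reading stale weights.
            model.patch_model()

    return wrapped


model = ToyModel(base_weight=1.0)
model.patches = {"lora_a": 0.5}
model.patch_model()

schedule = {2: {"lora_b": 0.25}}  # switch patches starting at step 2
outputs = toy_sample(model, steps=4, callback=hijack_callback(model, schedule))
print(outputs)  # [1.5, 1.5, 1.25, 1.25]
```

In this toy version the swap only takes effect because `patch_model` rewrites the same weight the sampler reads. If the weights the sampler uses live somewhere the patch step no longer touches, the swap would silently do nothing, which is one possible shape of the symptom described here.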

The whole thing is honestly a huge pile of hacks so it's entirely possible it worked merely by accident before and this change is just exposing some bugs.

asagi4 avatar Nov 04 '24 10:11 asagi4

This will be merged in https://github.com/comfyanonymous/ComfyUI/pull/5583, so please test that one.

comfyanonymous avatar Nov 11 '24 19:11 comfyanonymous