
[enhancement]: don't full load model on menu choice, load it on first invoke

Open · laurentopia opened this issue 2 years ago

Is there an existing issue for this?

  • [X] I have searched the existing issues

Contact Details

No response

What should this feature add?

Currently we waste a lot of time waiting on the UI. I understand that each model activates different options, but we could load only the model's metadata up front and load the rest asynchronously in the background to the CPU and GPU (see the sketch after this comment).

Alternatives

wait

Additional Content

No response

laurentopia avatar May 08 '23 23:05 laurentopia
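
A minimal sketch of the idea above, using invented names (this is not InvokeAI's real loader): read only the model's metadata synchronously so the UI can update immediately, and stream the weights in on a background thread so the first invoke blocks only if loading hasn't finished yet.

```python
import threading
import time

def load_weights(path):
    """Stand-in for the expensive part: reading gigabytes of weights
    from disk and moving them into CPU/GPU memory."""
    time.sleep(5)  # simulate the slow load
    return {"weights": path}

def read_metadata(path):
    """Stand-in for the cheap part: parsing only the model's config
    so the UI can populate its options right away."""
    return {"name": path, "options": ["img2img", "inpainting"]}

class LazyModel:
    """Expose metadata at selection time; load weights in the background."""

    def __init__(self, path):
        self.meta = read_metadata(path)   # fast, blocks the UI only briefly
        self.weights = None
        self._ready = threading.Event()
        threading.Thread(target=self._load, daemon=True).start()

    def _load(self):
        self.weights = load_weights(self.meta["name"])
        self._ready.set()

    def invoke(self, prompt):
        # The first generation waits here only if loading hasn't finished.
        self._ready.wait()
        return f"generated {prompt!r} with {self.meta['name']}"

model = LazyModel("stable-diffusion-1.5")    # UI updates immediately
print(model.invoke("a lighthouse at dawn"))  # blocks only until weights arrive
```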

This is how model loading works in the upcoming 3.0: loading is deferred until the model is needed for generation or another operation. In the meantime, you can improve loading speed by increasing the --max_loaded_models value in invokeai.init; this keeps more models cached in CPU RAM, making it fast to switch between them. In addition, permanently converting checkpoint files into diffusers will give you a 3-4X speedup.

lstein avatar May 10 '23 11:05 lstein
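
For reference, a hedged example of what that setting looks like in invokeai.init, InvokeAI 2.x's launch-options file (the exact contents and default value vary by release; the default cache size is small, commonly 2):

```
# invokeai.init -- one launch option per line
# raising this trades CPU RAM for faster model switching
--max_loaded_models=4
```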

Great, I'll sit on it. The funny thing is that I had already cranked up max_loaded_models and it had no effect on load time. It's possible that the PCIe 3.0 x16 bridge to the GPU is the bottleneck rather than the very fast SSD, though I could be wrong.

laurentopia avatar May 12 '23 04:05 laurentopia

So... this is ironic. The thing I don't like about 3.0 is that it doesn't preload the model, so there's an extra wait after you tell it to go. Maybe this should just be an async API call, with the session not starting until the loaded model matches the one selected?

davemedvitz avatar Jul 12 '23 17:07 davemedvitz
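
A hedged sketch of what that could look like from the client side. The endpoints here are invented for illustration and are not InvokeAI's real API routes: fire a preload request as soon as the user picks a model, then poll until the loaded model matches the selection before starting generation.

```python
import asyncio
import aiohttp

API = "http://localhost:9090"  # hypothetical server address

async def preload_model(session: aiohttp.ClientSession, model: str) -> None:
    # Ask the server to start warming the model as soon as it's selected.
    async with session.post(f"{API}/models/preload", json={"model": model}):
        pass

async def wait_until_loaded(session: aiohttp.ClientSession, model: str) -> None:
    # Poll until the loaded model matches the selection; only then generate.
    while True:
        async with session.get(f"{API}/models/current") as resp:
            state = await resp.json()
        if state.get("model") == model and state.get("status") == "ready":
            return
        await asyncio.sleep(0.5)

async def main() -> None:
    async with aiohttp.ClientSession() as session:
        await preload_model(session, "stable-diffusion-1.5")
        await wait_until_loaded(session, "stable-diffusion-1.5")
        # ...start the generation session here...

asyncio.run(main())
```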

Is there a reason why I see "Loading model" every time I start a new generation (even if I don't change the model)?

It takes a long time to start each generation, since it looks like everything is reloaded every time.

Qualzz avatar Jul 27 '23 21:07 Qualzz

> This is how model loading works in the upcoming 3.0: loading is deferred until the model is needed for generation or another operation. In the meantime, you can improve loading speed by increasing the --max_loaded_models value in invokeai.init; this keeps more models cached in CPU RAM, making it fast to switch between them. In addition, permanently converting checkpoint files into diffusers will give you a 3-4X speedup.

Can you notify us here when release 3 is live?

laurentopia avatar Aug 11 '23 18:08 laurentopia

I'm using 3.4.0rc2 and the model still reloads at each generation.

julien-blanchon avatar Oct 31 '23 22:10 julien-blanchon

Models are loaded during generation; they are cached based on user settings.

hipsterusername avatar Feb 21 '24 16:02 hipsterusername
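
That caching behavior can be pictured as a size-bounded LRU map keyed by model name. A minimal sketch with invented names (not InvokeAI's actual cache class): repeated generations with the same model hit the cache and skip the reload, while the user setting bounds how many models stay resident.

```python
from collections import OrderedDict

class ModelCache:
    """Keep the N most recently used models resident; evict the oldest."""

    def __init__(self, loader, max_models=2):
        self.loader = loader          # callable: model name -> loaded model
        self.max_models = max_models  # the user-configurable cache size
        self._cache = OrderedDict()

    def get(self, name):
        if name in self._cache:
            self._cache.move_to_end(name)    # hit: mark most recently used
            return self._cache[name]         # no reload needed
        model = self.loader(name)            # miss: pay the load cost once
        self._cache[name] = model
        if len(self._cache) > self.max_models:
            self._cache.popitem(last=False)  # evict least recently used
        return model

# Usage: the second call for the same name is a cache hit.
cache = ModelCache(loader=lambda name: f"<{name} weights>", max_models=2)
cache.get("sd-1.5")
cache.get("sd-1.5")
```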