panml
Model size and memory needed to run
Is there a way to figure out in advance how much memory:
- it will require to LOAD the model
- it will require to RUN the model
e.g. when I try to load google/flan-t5-large it seems to initially consume ~6GB of RAM and then settles down to ~3GB, which is roughly the file size.
- Is this normal behavior, i.e. requiring double the amount of RAM?
- Some models seem to be split across multiple files! How do you figure out the needed RAM then? (see the sketch below)
e.g. https://huggingface.co/stabilityai/stablelm-base-alpha-3b/tree/main
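For sharded checkpoints like the one above, the RAM needed just to hold the weights is roughly the sum of the shard file sizes (running the model then needs extra room for activations, any KV cache, and framework overhead). A minimal sketch, not part of panml, that pulls those sizes from the Hub, assuming the `huggingface_hub` package is installed:

```python
# Rough estimate of the weight memory for a (possibly sharded) model,
# obtained by summing the sizes of its weight files on the Hugging Face Hub.
from huggingface_hub import HfApi

def estimate_weight_memory_gb(repo_id: str) -> float:
    """Sum the sizes of the .bin / .safetensors weight files in a repo."""
    info = HfApi().model_info(repo_id, files_metadata=True)
    weight_bytes = sum(
        (f.size or 0)
        for f in info.siblings
        # Caveat: if a repo ships both .bin and .safetensors copies of the
        # same weights, this naive filter double-counts; pick one format.
        if f.rfilename.endswith((".bin", ".safetensors"))
    )
    return weight_bytes / 1024**3

print(estimate_weight_memory_gb("stabilityai/stablelm-base-alpha-3b"))
print(estimate_weight_memory_gb("google/flan-t5-large"))
```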
Yeah, I'm thinking about this too. Definitely a good one to put in. I think we can have a lookup and also pull this info when the user asks.
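A purely hypothetical sketch of that lookup-plus-on-demand idea (none of these names exist in panml today), reusing the `estimate_weight_memory_gb` helper from the sketch above:

```python
# Hypothetical memory hint: check a small built-in table first,
# and fall back to querying the Hub when the model is not listed.
APPROX_WEIGHT_GB = {
    # ~3GB checkpoint size reported earlier in this thread
    "google/flan-t5-large": 3.0,
}

def memory_hint(repo_id: str) -> float:
    """Approximate GB needed to hold the model weights in RAM."""
    if repo_id in APPROX_WEIGHT_GB:
        return APPROX_WEIGHT_GB[repo_id]
    return estimate_weight_memory_gb(repo_id)  # Hub query fallback (sketch above)
```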
Maybe this is the reason it doubles the CPU RAM: https://discuss.huggingface.co/t/how-much-memory-required-to-load-t0pp/10904
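That thread points to the likely cause: by default, `from_pretrained` first builds the model with randomly initialized weights and then loads the checkpoint's state dict on top, so two full copies of the weights exist in RAM at the peak. If that is what's happening here, a common mitigation is the `low_cpu_mem_usage` flag in transformers; a minimal sketch (assumes transformers is installed, plus accelerate on newer versions):

```python
# Load weights directly into their final place instead of materializing a
# random-init copy first, keeping peak CPU RAM close to the checkpoint size.
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-large",
    low_cpu_mem_usage=True,
)
```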