DeepSpeed-MII
Mixtral 8x7B: out of memory
Does the framework fully support the Mixtral 8x7B model?
I encountered an Out of Memory error during use.
The machine has 8x A100 80GB GPUs.
Hi @byerose, yes, we do support the Mixtral 8x7B model. Can you please share the script you are using? I have been able to run this model on as few as 2x A6000.
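
For a rough sense of scale, here is a back-of-the-envelope estimate of the fp16 weight memory per GPU. It is only a sketch: the parameter count (about 46.7B in total) is an approximation, the helper `weights_gib_per_gpu` is a hypothetical name rather than part of DeepSpeed-MII, and it assumes the weights are split evenly across GPUs, ignoring KV cache, activations, and framework overhead.

```python
def weights_gib_per_gpu(n_params: float, bytes_per_param: int, n_gpus: int) -> float:
    """Weight memory per GPU in GiB, assuming an even split across GPUs."""
    return n_params * bytes_per_param / n_gpus / 2**30


# Assumption: approximate total parameter count for Mixtral 8x7B.
MIXTRAL_PARAMS = 46.7e9
FP16_BYTES = 2  # bytes per parameter in half precision

for n_gpus in (2, 8):
    per_gpu = weights_gib_per_gpu(MIXTRAL_PARAMS, FP16_BYTES, n_gpus)
    print(f"{n_gpus} GPUs: ~{per_gpu:.1f} GiB of fp16 weights per GPU")
```

Under these assumptions the weights alone are a tight fit on two 48 GB cards and take up only a small fraction of each 80 GB A100 when spread over eight, so the launch script and how the model is sharded are the first things to check.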