DeepSpeed
memory reallocation for bigger batch size
Hi @reymondzzzz, thanks for the PR. I can see this fixes some assumptions we make about model size or batch size at runtime, but would you mind giving a description here of what it solves? I also see that you opened an issue about using ds-inference for a GPT-based model; is this PR related to that? Thanks, Reza
Closing due to age and lack of description.
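For context on what "memory reallocation for bigger batch size" typically means, here is a minimal sketch of the general pattern: a pre-allocated workspace buffer that is grown when a request arrives with a larger batch than the one it was sized for. This is an illustrative assumption about the problem the PR title describes, not the PR's actual code or DeepSpeed's internal API; the class and parameter names (`GrowableBuffer`, `hidden_size`) are hypothetical.

```python
import torch

class GrowableBuffer:
    """Workspace tensor that is reallocated only when its capacity is exceeded."""

    def __init__(self, hidden_size: int, dtype=torch.float16, device="cuda"):
        self.hidden_size = hidden_size
        self.dtype = dtype
        self.device = device
        self.buf = None      # lazily allocated on first use
        self.capacity = 0    # max batch * seq tokens the buffer can hold

    def get(self, batch_size: int, seq_len: int) -> torch.Tensor:
        needed = batch_size * seq_len
        if needed > self.capacity:
            # Drop the old buffer first so the allocator can reuse its memory,
            # then allocate a larger one sized for the new request.
            self.buf = None
            self.buf = torch.empty(
                needed, self.hidden_size, dtype=self.dtype, device=self.device
            )
            self.capacity = needed
        # Return a view shaped for the current request.
        return self.buf[:needed].view(batch_size, seq_len, self.hidden_size)


# Usage: the first call allocates for batch 1; a later, larger batch triggers
# a single reallocation instead of failing on a fixed-size workspace.
ws = GrowableBuffer(hidden_size=1024, device="cpu", dtype=torch.float32)
a = ws.get(batch_size=1, seq_len=128)
b = ws.get(batch_size=8, seq_len=128)   # grows the buffer here
```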