
estimate_zero2_model_states_mem_needs: fixing memory estimation

Open nelyahu opened this issue 1 year ago • 2 comments

The estimate was assuming 4 bytes per model parameter and 4 bytes per gradient. Changed both to 2 bytes, under the assumption of FP16/BF16.

nelyahu avatar Feb 08 '24 08:02 nelyahu
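
To illustrate the arithmetic behind the change described above, here is a minimal sketch. It is not the DeepSpeed implementation: `param_and_grad_bytes` is a hypothetical helper, the 1B-parameter model size is an arbitrary example, and it counts only parameters and gradients (no optimizer states, buffers, or sharding across GPUs).

```python
def param_and_grad_bytes(total_params: int, bytes_per_param: int, bytes_per_grad: int) -> int:
    """Bytes needed to hold one copy of the parameters plus one copy of the gradients.

    Hypothetical helper for illustration only; not part of DeepSpeed.
    """
    return total_params * (bytes_per_param + bytes_per_grad)


total_params = 1_000_000_000  # arbitrary example size

# Old assumption: 4 bytes per parameter and 4 bytes per gradient (FP32).
fp32_estimate = param_and_grad_bytes(total_params, 4, 4)

# New assumption: 2 bytes per parameter and 2 bytes per gradient (FP16/BF16).
half_estimate = param_and_grad_bytes(total_params, 2, 2)

print(f"4-byte assumption: {fp32_estimate / 2**30:.2f} GiB")
print(f"2-byte assumption: {half_estimate / 2**30:.2f} GiB")
```

Under these assumptions, the parameter and gradient portion of the estimate is halved when moving from 4-byte to 2-byte elements.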

@tjruwase @stas00 - What do you say about the updated patch?

nelyahu avatar May 13 '24 06:05 nelyahu

@nelyahu, LGTM. Thanks!

tjruwase avatar May 13 '24 09:05 tjruwase

@tjruwase @stas00, can you please re-run validation? The failure in "cpu-torch-latest" does not seem related.

nelyahu avatar May 27 '24 13:05 nelyahu

@tjruwase, can this be merged?

nelyahu avatar Jun 02 '24 06:06 nelyahu