DeepSpeed
Fix memory alignment bug in Stage 3 that gets triggered when number of params are not multiple of the world size
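To illustrate the kind of condition the title describes, here is a minimal sketch of padding a flat parameter count up to a multiple of the world size and computing each rank's slice. This is not DeepSpeed's actual code; the helper names `padded_numel` and `partition_bounds` are hypothetical and exist only for this example.

```python
def padded_numel(numel: int, world_size: int) -> int:
    """Round numel up to the nearest multiple of world_size (hypothetical helper)."""
    remainder = numel % world_size
    return numel if remainder == 0 else numel + (world_size - remainder)


def partition_bounds(numel: int, world_size: int, rank: int) -> tuple[int, int]:
    """Return the [start, end) range of real (unpadded) elements owned by rank."""
    # Every rank gets an equally sized slice of the padded buffer.
    partition_size = padded_numel(numel, world_size) // world_size
    # Clamp to numel so trailing ranks never index into the padding.
    start = min(rank * partition_size, numel)
    end = min(start + partition_size, numel)
    return start, end


if __name__ == "__main__":
    # 10 elements across 4 ranks: the buffer is padded to 12, and the last
    # rank owns fewer real elements than the others.
    for rank in range(4):
        print(rank, partition_bounds(10, 4, rank))
```

An assert that assumes every partition holds the same number of real elements would fail for the last rank in this case, which is the general shape of bug the title refers to.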
It appears this fixed the previous issue in one spot. However, now I am seeing the same error in a new assert:
https://github.com/microsoft/DeepSpeed/blob/da71a8975d7387c903c32abd4ec0ff6f174980e0/deepspeed/runtime/zero/stage3.py#L2251-L2253

Can one of the admins verify this patch?
Pretty sure this is not needed anymore; the code around this spot has changed significantly since then. @tjruwase do you know more here?
@jeffra, yes okay to close.