evkogs

Results: 1 issue by evkogs

I'd like to propose implementing fail-safe mechanisms and partial redundancy in FSDP2 (at that point perhaps no longer FSDP, more like HSDP) to allow for more robust training on unreliable...

enhancement