InfiniTransformer
Is ZeRO-3 supported?
I used accelerate launch with ZeRO-3 to run train.llama.infini.noclm.1Mseq.sh, but I got this error: RuntimeError: Function 'LinearFunctionForZeroStage3Backward' returned nan values in its 0th output
I have this question too