DeepSpeed
Fix Bloom logits mismatch
With kernel injection enabled, Bloom produced logits that differed significantly from the Transformers baseline, as reported in issue https://github.com/microsoft/DeepSpeed/issues/2730.
The softmax input_mask arrives as float32, not int64, so it needs to be converted to half before it is used.
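To illustrate why the mask dtype matters, here is a minimal standalone sketch. It is not the DeepSpeed kernel code. It uses only Python's `struct` module and assumes a hypothetical additive mask (`0.0` for visible positions, `-10000.0` for masked ones). It contrasts reading float32 bytes as if they were half values with an explicit conversion to half. Whether the injected kernel failed in exactly this way is an assumption.

```python
import struct


def reinterpret_f32_as_f16(values):
    """Pack values as float32, then read the same bytes as float16.

    No numeric conversion happens here, so each float32 is split
    into two unrelated half values.
    """
    raw = struct.pack(f"<{len(values)}f", *values)
    return list(struct.unpack(f"<{len(raw) // 2}e", raw))


def convert_f32_to_f16(values):
    """Round each value to float16 explicitly, keeping the mask semantics."""
    raw = struct.pack(f"<{len(values)}e", *values)
    return list(struct.unpack(f"<{len(values)}e", raw))


# Hypothetical additive attention mask: 0.0 = keep, -10000.0 = masked out.
mask = [0.0, -10000.0]

wrong = reinterpret_f32_as_f16(mask)   # bytes misread as half
right = convert_f32_to_f16(mask)       # values converted to half

print("reinterpreted:", wrong)
print("converted:    ", right)
```

In the reinterpreted list the large negative entry no longer appears, so a softmax consuming it would not mask the intended position. The explicit conversion keeps both entries intact.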
@RezaYazdaniAminabadi