TransformerEngine
Support DeepSeek FP8 recipe in JAX
Is your feature request related to a problem? Please describe.
N/A

Describe the solution you'd like
Support the DeepSeek FP8 recipe in JAX. It is already supported in PyTorch.

Describe alternatives you've considered
N/A

Notes / Additional context
Block-scaled activations and layernorm are optimizations and are not needed for functionality.
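
For reference, here is a rough sketch of what block-scaled FP8 quantization looks like numerically in plain JAX. It is not TransformerEngine code and does not reflect its API. The helper names `quantize_blockwise` and `dequantize_blockwise` are hypothetical, and the tile sizes used in the usage note (1x128 for activations, 128x128 for weights) are an assumption taken from the published DeepSeek-V3 description, since this issue does not spell them out.

```python
import jax
import jax.numpy as jnp

# Assumption: E4M3 is the FP8 format; its largest finite value is 448.
FP8_DTYPE = jnp.float8_e4m3fn
FP8_MAX = float(jnp.finfo(FP8_DTYPE).max)


def quantize_blockwise(x, block_rows, block_cols):
    """Quantize a 2D array to FP8 with one scale per (block_rows x block_cols) tile.

    Hypothetical helper for illustration only.
    Returns the FP8 array and the per-tile dequantization scales.
    """
    rows, cols = x.shape
    assert rows % block_rows == 0 and cols % block_cols == 0
    # View the array as a grid of tiles: (tile_row, in_tile_row, tile_col, in_tile_col).
    tiles = x.reshape(rows // block_rows, block_rows, cols // block_cols, block_cols)
    # Per-tile absolute maximum; the floor avoids division by zero for all-zero tiles.
    amax = jnp.max(jnp.abs(tiles), axis=(1, 3), keepdims=True)
    scale = jnp.maximum(amax, 1e-12) / FP8_MAX
    # Scale each tile into the FP8 range, clip against rounding overshoot, then cast.
    scaled = jnp.clip(tiles / scale, -FP8_MAX, FP8_MAX)
    q = scaled.astype(FP8_DTYPE).reshape(rows, cols)
    return q, scale[:, 0, :, 0]


def dequantize_blockwise(q, scale, block_rows, block_cols):
    """Invert quantize_blockwise by multiplying each tile by its scale."""
    rows, cols = q.shape
    tiles = q.astype(jnp.float32).reshape(
        rows // block_rows, block_rows, cols // block_cols, block_cols
    )
    return (tiles * scale[:, None, :, None]).reshape(rows, cols)
```

Under the assumed tile sizes, activations would be quantized with `quantize_blockwise(x, 1, 128)` and weights with `quantize_blockwise(w, 128, 128)`. A functional first version could quantize this way as a separate step before the GEMM, with fused block-scaled activation and layernorm outputs added later as optimizations.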