nvMelissa
**Is your feature request related to a problem? Please describe.** N/A **Describe the solution you'd like** Support the DeepSeek FP8 recipe in JAX. It is already supported in PyTorch. **Describe alternatives you've considered**...
**Is your feature request related to a problem? Please describe.** At the moment, we enumerate the parameters in C APIs like this: https://github.com/NVIDIA/TransformerEngine/blob/5e4e0b2c378d2b1ec2ee65dfa85124e1dd805389/transformer_engine/common/fused_attn/fused_attn.cpp#L835 As we add more features to attention,...
**Is your feature request related to a problem? Please describe.** This is not related to a problem; it is a feature request to expand model coverage. **Describe the solution you'd...
**Is your feature request related to a problem? Please describe.** To be added **Describe the solution you'd like** Work on improving performance for FP8 current scaling. **Describe alternatives you've considered**...
**Is your feature request related to a problem? Please describe.** The logic around cuDNN's support matrix for SDPA is getting long and hard to maintain. **Describe the solution you'd like**...
## 🚀 Feature Dynamic Shapes - Rework Thunder to be shape-agnostic wherever possible. We should only specialize code for specific shapes in small, targeted areas where it's truly necessary. ###...
## 🚀 Model / language coverage Investigate and fix slowdowns and long compile times for target MoE models: Llama 4, GPT OSS, DeepSeek V3.1, and Qwen3-Next ### Pitch ### Alternatives...
## 🚀 Feature Match and replace computation, in particular sequences of multiple operations. This feature will make it easier and faster for developers to improve model performance of new models...