Thomas Neff
**Describe the bug**

We noticed `NaN` values being produced by the cuDNN SDPA backward pass in our training runs. After digging a bit, we narrowed it down to a minimal repro case...