Thomas Neff

Results: 1 issue by Thomas Neff

**Describe the bug** We noticed `NaN` values being generated by the cuDNN SDPA backward pass in our training runs. After some digging, we narrowed it down to a minimal repro case...