JohnAlphaIII
Results: 2 issues of JohnAlphaIII
Warp kernel crashes for some input data in fp16 and bf16. E.g.
```
[B C T]
[2, 2, 32768] -- works
[4, 2, 32768] -- doesn't
[2, 4, 32768]...
```
Hi, there is a performance [benchmark](https://github.com/NVIDIA/cutlass/blob/main/media/images/cutlass-3.8-blackwell-gemm-peak-performance.svg) in README.md, but there is no link to the code to reproduce it. Can you please point me to the source code for this...
Labels: question, ? - Needs Triage