fei-xx
@hwu36 Confirmed it uses nvcc 12.3. I'm the user [at this issue](https://github.com/NVIDIA/cutlass/issues/1044#issuecomment-1974291945). I could compile the code there with `-G`, but not the above example code.
> thank you. i can reproduce and it is reported to nvcc team now.

@hwu36 kindly follow up with any updates?
@hwu36 @ANIKET-SHIVAM, it looks like the CUTLASS profiler does not support NVFP4 grouped GEMM. Below is my command:

```
tools/profiler/cutlass_profiler --operation=grouped_gemm --m=4096 --n=4096 --k=4096 --num_groups=4 --runtime_input_datatype_a=e2m1 --runtime_input_datatype_b=e2m1
```

Any ideas of...
BTW, does `blackwell_grouped_gemm_block_scaled` support `split-k` or `sliced-k`?
I'm also seeing a similar issue. Any updates?