[BUG] failing assert in 57_hopper_grouped_gemm example
Describe the bug
When running 57_hopper_grouped_gemm with a large enough output, the [pipeline_check_is_consumer](https://github.com/NVIDIA/cutlass/blob/main/include/cutlass/pipeline/sm90_pipeline.hpp#L84) assert fails.
The failure is masked because NDEBUG is defined by default, but if one were to, e.g., add #undef NDEBUG to the top of the 57_hopper_grouped_gemm.cu file, the example would fail.
The assert is triggered because producer threads call consumer_wait before updating descriptors for the next group.
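To make the failure mode concrete, here is a minimal host-side sketch of a role check of this kind. It is not the CUTLASS implementation: `ToyPipeline`, `checks_enabled`, and `run_demo` are hypothetical names invented for illustration, the enum only mirrors the roles discussed in this issue, and the sketch throws an exception where the real code would hit a device-side assert.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical host-side model of the roles involved; not CUTLASS code.
enum class ThreadCategory { NonParticipant, Producer, Consumer, ProducerConsumer };

// Same idea as the check named in this report: only consumer-capable roles pass.
constexpr bool pipeline_check_is_consumer(ThreadCategory role) {
  return role == ThreadCategory::Consumer || role == ThreadCategory::ProducerConsumer;
}

struct ToyPipeline {
  ThreadCategory role;
  bool checks_enabled;  // assumption: stands in for "NDEBUG is not defined"

  // Throws instead of asserting so the failure is observable on the host.
  void consumer_wait() const {
    if (checks_enabled && !pipeline_check_is_consumer(role)) {
      throw std::logic_error("consumer_wait called by a non-consumer role");
    }
    // A real pipeline would block on a barrier here; the toy does nothing.
  }
};

// Returns true if calling consumer_wait on the given pipeline trips the check.
inline bool consumer_wait_fails(const ToyPipeline& pipeline) {
  try {
    pipeline.consumer_wait();
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}

// Walks through the three cases discussed in this issue.
inline void run_demo() {
  // Producer role with checks on: the situation described above.
  assert(consumer_wait_fails(ToyPipeline{ThreadCategory::Producer, true}));
  // Same call with checks off: the mismatch goes unnoticed.
  assert(!consumer_wait_fails(ToyPipeline{ThreadCategory::Producer, false}));
  // Consumer role: the check passes either way.
  assert(!consumer_wait_fails(ToyPipeline{ThreadCategory::Consumer, true}));
}
```

In this toy model the call itself is identical in all three cases; only the role and the check setting decide whether the mismatch is reported.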
Steps/Code to reproduce bug
- add #undef NDEBUG to the top of 57_hopper_grouped_gemm.cu
- make 57_hopper_grouped_gemm
- ./57_hopper_grouped_gemm --m=512 --n=512 --k=7168 --groups=256 --alpha=1.0 --beta=0.0
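The first step relies on standard `assert` behavior: the macro expands according to whether NDEBUG is defined at the point where `<cassert>` is included, so the `#undef` has to appear before any include that pulls it in. The following is a small host-only sketch of that mechanism, not part of the example; `asserts_are_active` is a hypothetical helper name.

```cpp
// Must appear before the first include that pulls in <cassert>.
#undef NDEBUG
#include <cassert>

// Returns true if assert() evaluates its argument in this translation unit.
inline bool asserts_are_active() {
  bool evaluated = false;
  // With NDEBUG defined, the whole expression is compiled out and
  // `evaluated` stays false; without it, the assignment runs and passes.
  assert((evaluated = true));
  return evaluated;
}
```

If the same helper were compiled with NDEBUG left defined, it would return false, which is why the failing check goes unnoticed in the default build.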
CC @ANIKET-SHIVAM
@thefacetakt did you build cutlass in Debug mode using CMake? I want to confirm which build setting(s) this gets triggered under, because those debugging checks were only intended to be enabled in debug mode.
Also, that piece of code is going to be updated soon because of programming model updates.
Hi @ANIKET-SHIVAM,
I first noticed the issue while trying to adapt the 57_hopper_grouped_gemm example for use in https://github.com/tgale96/grouped_gemm (which is built without the NDEBUG flag, but not in debug mode).
I was then able to reproduce it in the 57_hopper_grouped_gemm example just by adding #undef NDEBUG to the top of the file.
I am wondering whether this assert may impact output correctness, or whether I can just build with NDEBUG enabled to ignore it and everything will work out fine.
For now, in your use case, you can just build with NDEBUG enabled to ignore it, and everything should work out fine.
Thanks!
Closing this now.