Title: TL1/TL2 codegen fails for any configuration with bm=16 on Windows 11

Status: Open. Opened by Chen-Shanpu 2 months ago • 0 comments

Generating TL1/TL2 kernels with bm=16 fails for every configuration, at one of two points: (1) while running codegen_tl1.py / codegen_tl2.py, or (2) during the CMake build (llama-bench fails to build).

The failure is consistent across all BM/BK settings when bm=16. Other values of bm (e.g., bm=32, bm=64, bm=128) work normally.

Chen-Shanpu, Dec 03 '25 11:12