BitNet
Refactor TL2 codegen
The codegen for TL2 was a bit difficult to reason about since the C++ code was directly embedded as a Python string. With this PR, I've added a simple Jinja2 template which contains the whole file, making it much easier to follow and to change.
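For readers unfamiliar with the approach, here is a minimal sketch of rendering a C++ header from a Jinja2 template in Python. It is not the template from this PR: the kernel name, the `BM_`/`BK_` constants, their values, and the `render_kernels` helper are all hypothetical and only illustrate the pattern.

```python
from jinja2 import Environment, StrictUndefined

# Hypothetical template: the real one in this PR contains the whole
# generated header. Names and values here are illustrative only.
TEMPLATE = """\
// Generated file, do not edit by hand.
#pragma once
#include <cstdint>

{% for k in kernels %}
constexpr auto BM_{{ k.name }} = {{ k.bm }};
constexpr auto BK_{{ k.name }} = {{ k.bk }};

inline void qgemm_lut_{{ k.name }}(const int8_t* lut, int32_t* out) {
    // Kernel body elided in this sketch.
}
{% endfor %}
"""


def render_kernels(kernels):
    """Render the header text for a list of kernel parameter dicts."""
    env = Environment(
        trim_blocks=True,            # drop the newline after a block tag
        lstrip_blocks=True,          # strip indentation before a block tag
        keep_trailing_newline=True,  # keep the final newline of the file
        undefined=StrictUndefined,   # fail loudly on a missing variable
    )
    return env.from_string(TEMPLATE).render(kernels=kernels)


# Hypothetical kernel shape, chosen only for the example.
header = render_kernels([{"name": "3200_8640", "bm": 160, "bk": 96}])
print(header)
```

Keeping the C++ in one template means the file can be read top to bottom as C++, with only the varying parts marked by `{{ ... }}` and `{% ... %}`.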
Other notable changes:
- added `jinja2` to requirements
- don't use `#define`s but `constexpr auto` instead
- some of the `#pragma unroll` loops have been made portable (notably to GCC) using the `UNROLL_LOOP` macro (see also #83)
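As a rough illustration of the portability point, the sketch below renders one possible definition of an `UNROLL_LOOP` macro from a template. This is an assumption about how such a macro could look, not the definition used in this PR: the `gcc_unroll` parameter, the `sum{{ n }}` helper, and the `render_prelude` function are hypothetical. It relies on Clang accepting `#pragma unroll` and GCC (8 and later) accepting `#pragma GCC unroll N`.

```python
from jinja2 import Environment, StrictUndefined

# Hypothetical prelude: one way to make an unroll hint portable.
# Clang is checked first because it also defines __GNUC__.
PRELUDE = """\
#if defined(__clang__)
#define UNROLL_LOOP _Pragma("unroll")
#elif defined(__GNUC__)
#define UNROLL_LOOP _Pragma("GCC unroll {{ gcc_unroll }}")
#else
#define UNROLL_LOOP
#endif

inline int sum{{ n }}(const int* v) {
    int acc = 0;
    UNROLL_LOOP
    for (int i = 0; i < {{ n }}; i++) {
        acc += v[i];
    }
    return acc;
}
"""


def render_prelude(n, gcc_unroll):
    """Render the macro definition plus a small loop that uses it."""
    env = Environment(undefined=StrictUndefined, keep_trailing_newline=True)
    return env.from_string(PRELUDE).render(n=n, gcc_unroll=gcc_unroll)


prelude = render_prelude(n=8, gcc_unroll=8)
print(prelude)
```

On compilers that match neither branch, the macro expands to nothing, so the loop still compiles and is simply left to the optimizer.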
The generated `bitnet-lut-kernels.h` is more or less identical to the previous output (with some slight whitespace differences).
Note that I didn't do the same for TL1 codegen since I'd first like to get feedback on whether this is a step in the right direction.