Julia

Results: 3 comments by Julia

> I'd like to quantize the ffn.down.weight as such without recompiling LlamaCPP

Yeah, that's the idea. I actually explained my intentions slightly incorrectly in the first post above. It's actually...

> Moreover, and that's a bit more complex, the ideal combination might be to be able to use a customizable form "more_bits feature" (query it in the llama.cpp file) to...

I think this should be ready. I added parsing of enum values (so that friendly names like Q8_0 can be used instead of their numeric values), wildcards for tensor names,...
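To make the two features mentioned in that comment concrete, here is a minimal Python sketch of the general idea, not the actual llama.cpp code. Everything in it is an assumption made for illustration: the `pattern=TYPE` override syntax, the function names, the tensor names, and the small name-to-number table (its numeric values are placeholders and should be checked against the real enum).

```python
from fnmatch import fnmatchcase

# Hypothetical name-to-number table. The numeric values are assumptions
# for this sketch and are not read from any llama.cpp header.
TYPE_NAMES = {"F32": 0, "F16": 1, "Q4_0": 2, "Q8_0": 8}


def parse_type(text):
    """Accept a friendly name such as "Q8_0" or a plain numeric value."""
    text = text.strip()
    if text.isdigit():
        return int(text)
    try:
        return TYPE_NAMES[text.upper()]
    except KeyError:
        raise ValueError(f"unknown type name: {text!r}") from None


def parse_override(spec):
    """Split a hypothetical "pattern=TYPE" spec into (pattern, type_id)."""
    pattern, _, type_text = spec.partition("=")
    if not pattern or not type_text:
        raise ValueError(f"expected pattern=TYPE, got {spec!r}")
    return pattern, parse_type(type_text)


def resolve_type(tensor_name, overrides, default):
    """Return the type of the first override whose wildcard matches the name."""
    for pattern, type_id in overrides:
        if fnmatchcase(tensor_name, pattern):
            return type_id
    return default


# Hypothetical usage: one wildcard rule covers the same tensor in every block.
overrides = [parse_override("*.ffn_down.weight=Q8_0")]
chosen = resolve_type("blk.3.ffn_down.weight", overrides, TYPE_NAMES["Q4_0"])
```

The first matching rule wins in this sketch, so more specific patterns would need to be listed before broader ones. Whether the real implementation orders rules this way is not stated in the comment.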