llama.cpp
quantize: improve pattern matching for allowed tensors
This PR implements @slaren's recommendation to use regex matching for allowed tensors. For example, --tensor-type attn=q4_k will now apply to all tensors named *attn*.
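To illustrate the matching behaviour described above, here is a minimal sketch and not the code in this PR. The names tensor_override, parse_override and match_override are hypothetical, and the sketch assumes the pattern is tested with std::regex_search, so it matches anywhere in the tensor name, and that the first matching override wins.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical override entry: {regex pattern, quantization type name}.
using tensor_override = std::pair<std::string, std::string>;

// Hypothetical helper: split a "pattern=type" argument at the last '='.
// Returns nullopt if either side is missing.
std::optional<tensor_override> parse_override(const std::string & arg) {
    const auto pos = arg.rfind('=');
    if (pos == std::string::npos || pos == 0 || pos + 1 == arg.size()) {
        return std::nullopt;
    }
    return tensor_override{arg.substr(0, pos), arg.substr(pos + 1)};
}

// Hypothetical helper: return the type of the first override whose pattern
// matches anywhere in the tensor name, or nullopt if none match.
// std::regex throws std::regex_error if a pattern is not a valid regex.
std::optional<std::string> match_override(const std::string & tensor_name,
                                          const std::vector<tensor_override> & overrides) {
    for (const auto & [pattern, type] : overrides) {
        if (std::regex_search(tensor_name, std::regex(pattern))) {
            return type;
        }
    }
    return std::nullopt;
}
```

Under these assumptions, an override parsed from attn=q4_k would match a tensor named blk.0.attn_q.weight but not blk.0.ffn_down.weight.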
Apologies for the shotgun approach @ggerganov / @slaren / @ngxson, I'm not sure what the proper process for requesting a review is. This PR addresses deficiencies in #12511. Happy to close it or move it to draft if it's not suitable for merging.