Ed Addario

25 comments by Ed Addario

**Really** looking forward to this PR being merged into master! In the meantime, you may already know this, but I'm passing along a tip shared by @David-AU-github [here](https://github.com/ggml-org/llama.cpp/pull/12718#issuecomment-2889528626) that has...

Echoing @nicoboss' sentiment. This is a very nice enhancement @compilade. Thank you. I've been testing by running different permutations of options, including roundtrips on each test, and comparing the resulting...

Assuming no additional changes to this PR, the enhanced version of #12718 is ready to go as soon as this one is merged.

PR #12511 changed the `llama_model_quantize_params` API by introducing an additional `void * tensor_types` parameter.

Haven't had much of an opportunity to play with T2I models yet but if someone can point me to a sample model and imatrix file, happy to make the necessary...

Apologies for the shotgun approach @ggerganov / @slaren / @ngxson; I'm not sure what the proper process to request a review is. This PR addresses the deficiencies of #12511. Happy to close or...

Thank you @ngxson. Yes, it will process any imatrix file produced by `llama-imatrix`, but it is restricted to a single file (it does not handle multiple `--in-file` arguments).
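For context on what handling multiple `--in-file` inputs would involve, here is a minimal sketch of merging importance data from several runs. This is hypothetical: the real imatrix binary format is not reproduced, and we assume each file has already been parsed into a dictionary mapping tensor names to a pair of (per-element sums of squared activations, chunk count), so that adding sums and counts yields a count-weighted average.

```python
# Hypothetical sketch: combining importance data from several imatrix runs.
# Assumes each parsed "file" is {tensor_name: (sums_of_squared_activations, count)}.
# The real llama-imatrix file layout is not modeled here.

def merge_imatrix(datasets):
    """Merge per-tensor (sums, count) pairs by adding element-wise sums
    and chunk counts, equivalent to a count-weighted average of the means."""
    merged = {}
    for data in datasets:
        for name, (sums, count) in data.items():
            if name in merged:
                prev_sums, prev_count = merged[name]
                merged[name] = ([a + b for a, b in zip(prev_sums, sums)],
                                prev_count + count)
            else:
                merged[name] = (list(sums), count)
    return merged

# Illustrative data only (tensor name follows llama.cpp naming conventions)
run_a = {"blk.0.attn_q.weight": ([2.0, 4.0], 10)}
run_b = {"blk.0.attn_q.weight": ([1.0, 1.0], 5)}
merged = merge_imatrix([run_a, run_b])
# merged: sums [3.0, 5.0] accumulated over 15 chunks
```

Keeping raw sums and counts (rather than pre-averaged means) is what makes the merge order-independent and exact.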

Not sure if I'm understanding the comment correctly @jukofyork, but the logic I'm using to identify the most influential tensors/layers is to simply average the importance scores (IS) for each,...
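The averaging logic described above can be sketched as follows. The tensor names and score values are illustrative, not taken from a real imatrix file; the only assumption is that each tensor has a list of per-element importance scores (IS) whose mean is used for ranking.

```python
# Hypothetical sketch of the ranking logic: average each tensor's
# importance scores and sort descending by the mean.

def rank_tensors(importance):
    """importance maps tensor name -> list of per-element scores.
    Returns tensor names sorted by mean score, most influential first."""
    means = {name: sum(vals) / len(vals) for name, vals in importance.items()}
    return sorted(means, key=means.get, reverse=True)

# Illustrative data only
scores = {
    "blk.0.ffn_down.weight": [0.9, 1.1, 1.0],   # mean 1.0
    "blk.0.attn_k.weight":   [0.1, 0.2, 0.3],   # mean 0.2
    "blk.1.ffn_down.weight": [0.5, 0.5, 0.5],   # mean 0.5
}
ranking = rank_tensors(scores)
# ranking[0] is the tensor with the largest mean score
```

As the follow-up comments note, a plain mean of score magnitudes measures activation size rather than true influence, so this ranking should be read as a heuristic.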

Very clear now, thanks @compilade. You're correct: I'm using the averaged mean squared activation to identify which tensors/layers produce large-magnitude activations, and I agree it isn't as accurate as,...

Had a chance to think about this more thoroughly and now understand the implications of @jukofyork's and @compilade's comments. I agree my current approach is not really identifying influence but rather score...