
TODO (**Optimize, potentially using new VQSort PartialSort**)

Open enum-class opened this issue 1 year ago • 1 comments

I have a question about this TODO (Optimize, potentially using new VQSort PartialSort). I would like to work on it, but I'm struggling to find a clean solution. Can you help me out?

At first glance, VQSelect seems sufficient, since create_distribution doesn't need sorted probabilities.

One idea is to create an array of key-value pairs (something like K32V32) from the probabilities and their indices, then apply VQSelect and pass the first 'k' elements to 'create_distribution'. However, this involves allocating and copying a potentially large probabilities array, and it requires a special comparison structure, something like OrderDescendingKV64.
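To make the key-value idea concrete, here is a minimal sketch of the selection step. It is only an illustration under stated assumptions: `ProbIndex` and `SelectTopK` are hypothetical names, and `std::nth_element` from the standard library stands in for VQSelect, whose actual interface is not shown in this thread. It also makes the cost visible: one pair is allocated and filled per probability.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical key-value pair: the probability is the key, the token index
// is the value. This plays the role of the K32V32-style pair from the post.
struct ProbIndex {
  float prob;
  uint32_t index;
};

// Returns the k most probable entries in unspecified order.
// Assumption: std::nth_element is used as a stand-in for VQSelect.
std::vector<ProbIndex> SelectTopK(const std::vector<float>& probs, size_t k) {
  assert(k <= probs.size());

  // This is the allocation and copy the post is concerned about.
  std::vector<ProbIndex> pairs(probs.size());
  for (size_t i = 0; i < probs.size(); ++i) {
    pairs[i] = {probs[i], static_cast<uint32_t>(i)};
  }

  // Partition in descending order of probability: afterwards the first k
  // elements are the k largest, but they are not sorted among themselves.
  std::nth_element(pairs.begin(), pairs.begin() + k, pairs.end(),
                   [](const ProbIndex& a, const ProbIndex& b) {
                     return a.prob > b.prob;
                   });

  pairs.resize(k);
  return pairs;
}
```

The returned pairs could then be handed to whatever builds the sampling distribution. Because a selection only partitions the array, it does less work than a partial sort, which matches the observation that sorted probabilities are not needed.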

Another idea is to create a special version of VQSelect just for this case.

Or simply leave the code as it is. What do you think?

enum-class, Mar 29 '24 04:03

Thanks for considering this! I think it's fairly low on the profile, so let's focus on other things first, in particular the prefill batching and matmul. I'm working on a plan for those and will post an issue soon with a proposed roadmap :)

jan-wassenberg, Apr 02 '24 12:04

Closing; I think this is still not very time-critical :)

jan-wassenberg, Jul 15 '24 10:07