llama-cpp-python
Fix disk-cache LRU logic
`LlamaDiskCache` tried to implement LRU logic, but did not succeed: `__getitem__` did a `pop()` from the cache without the corresponding re-insert (that line was commented out), so repeatedly executing the same prompt alternates between miss and hit (miss, hit, miss, hit, ...). `__setitem__` tried to reorder entries, which makes no sense when the cache only ever holds zero or one element. A rough sketch of the buggy pattern is shown below.
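A minimal sketch of the broken pattern, assuming a `diskcache.Cache`-backed class (class and method bodies here are illustrative, not the exact code from `llama_cpp`):

```python
import diskcache


class BuggyDiskCache:
    """Sketch of the broken LRU attempt: get removes, but never re-adds."""

    def __init__(self, cache_dir: str = ".cache"):
        self.cache = diskcache.Cache(cache_dir)

    def __getitem__(self, key):
        # pop() removes the entry from the cache entirely. Without the
        # re-insert (commented out, as in the original), the next lookup
        # of the same key is a miss: hence miss, hit, miss, hit, ...
        value = self.cache.pop(key)
        if value is None:
            raise KeyError("Key not found")
        # self.cache[key] = value  # the missing push
        return value

    def __setitem__(self, key, value):
        # Any "reordering" here is pointless: after each pop() the cache
        # holds at most one element, so there is nothing to reorder.
        self.cache[key] = value
```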
The solution is as simple as it gets: the underlying `diskcache` library already implements LRU behavior by default, so `LlamaDiskCache` does not need to do anything beyond a plain `get` and `set`.
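Under that approach, the methods reduce to straightforward delegation (again a sketch, not the exact patch; the explicit `eviction_policy` argument is shown only to make the LRU setting visible):

```python
import diskcache


class FixedDiskCache:
    """Sketch of the fix: let diskcache handle LRU bookkeeping itself."""

    def __init__(self, cache_dir: str = ".cache"):
        # diskcache tracks access metadata and evicts per its
        # eviction_policy; "least-recently-used" makes LRU explicit.
        self.cache = diskcache.Cache(
            cache_dir, eviction_policy="least-recently-used"
        )

    def __getitem__(self, key):
        # A plain get() leaves the entry in place; diskcache updates
        # its recency bookkeeping, so repeated prompts keep hitting.
        value = self.cache.get(key)
        if value is None:
            raise KeyError("Key not found")
        return value

    def __setitem__(self, key, value):
        # A plain set is enough; no manual reordering required.
        self.cache[key] = value
```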