RedmiS22018
Results
2 issues of RedmiS22018
Instead of loading the weights into memory, compare every byte in the LLaMA and delta files and add the delta to the bytes, loading 4 KB at a time,...
Byte deltas
12
Instead of using parameter deltas, this implementation compares each byte in the delta and in the LLaMA model and outputs the Vicuna model. This offers significantly less RAM usage compared...
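The streaming idea described in these snippets can be illustrated with a minimal sketch. It is not the author's implementation. It assumes the delta file has exactly the same length as the base file and that each output byte is the sum of the two input bytes modulo 256; the snippets do not say how the delta is actually encoded. The function name `apply_byte_delta` and the file paths are hypothetical.

```python
CHUNK_SIZE = 4096  # 4 KB per read, matching the chunk size mentioned in the snippet


def apply_byte_delta(base_path, delta_path, out_path, chunk_size=CHUNK_SIZE):
    """Write (base + delta) mod 256 to out_path, reading both inputs in small chunks.

    Assumption: both files have the same length and the delta is a plain
    per-byte offset. Only about two chunks are held in memory at any time.
    """
    with open(base_path, "rb") as base, \
            open(delta_path, "rb") as delta, \
            open(out_path, "wb") as out:
        while True:
            base_chunk = base.read(chunk_size)
            delta_chunk = delta.read(chunk_size)
            if not base_chunk and not delta_chunk:
                break  # both files are exhausted
            if len(base_chunk) != len(delta_chunk):
                raise ValueError("base and delta files differ in length")
            # Add byte by byte, wrapping around at 256 (assumed encoding).
            out.write(bytes((b + d) & 0xFF
                            for b, d in zip(base_chunk, delta_chunk)))
```

Because each iteration touches only one chunk from each input, peak memory stays near `2 * chunk_size` bytes regardless of file size, which is the kind of RAM saving the snippet describes. Note that adding bytes modulo 256 is not the same as adding floating-point parameter values, so a delta file for this scheme would have to be produced by the matching byte-wise subtraction.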