
Loading sharded (GGUF) model files from HF with Llama.from_pretrained() 'additional_files' argument

Open Gnurro opened this issue 1 year ago • 0 comments

Added code allows specifying multiple files to download via the HuggingFace Hub in Llama.from_pretrained(). The new 'additional_files' argument takes a list of strings, each used the same way as the existing 'filename' string argument. The code could likely be more elegant (e.g. parallel downloads, but I'm not familiar enough with the HF Hub library); it works as-is. Tested and working on Windows 10 and Ubuntu (inside a Docker stack).
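A minimal sketch of how the argument might be used. The repo id and shard naming below are illustrative assumptions, not a real model; the `shard_names` helper is hypothetical and just builds the conventional `-NNNNN-of-NNNNN.gguf` shard file names:

```python
# Hypothetical sketch: passing sharded GGUF file names to Llama.from_pretrained().

def shard_names(prefix: str, total: int) -> list[str]:
    # Sharded GGUF files conventionally follow "<prefix>-00001-of-0000N.gguf".
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]

files = shard_names("model", 3)
# files == ["model-00001-of-00003.gguf",
#           "model-00002-of-00003.gguf",
#           "model-00003-of-00003.gguf"]

# With this change, the first shard would go to `filename` and the rest to
# `additional_files`, so all shards are fetched from the Hub before loading
# (commented out here since it downloads a real model):
#
# from llama_cpp import Llama
# llm = Llama.from_pretrained(
#     repo_id="some-org/some-sharded-model-GGUF",  # hypothetical repo
#     filename=files[0],
#     additional_files=files[1:],
# )
```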

Gnurro avatar May 14 '24 14:05 Gnurro