[BUG]: Unknown pre-tokenizer type: 'gpt-4o'
Description
Loading a Microsoft Phi-4-mini-instruct (4bit quantization) model fails with:
unknown pre-tokenizer type: 'gpt-4o'
This issue was already addressed in llama.cpp b4792.
Reproduction Steps
LLamaWeights.LoadFromFile(new ModelParams("Phi-4-mini-instruct-Q4_K_M.gguf"));
Environment & Configuration
- Operating system: Windows 11
- .NET runtime version: 9
- LLamaSharp version: 0.21.0
- CUDA version (if you are using cuda backend): 12
Known Workarounds
- Download newer llama.cpp release b4792
- Use NativeLibraryConfig.LLama.WithLibrary to load the downloaded llama.dll
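A minimal sketch of the second workaround, in case it helps anyone else. The DLL path is hypothetical; point it at wherever the llama.cpp release archive was extracted. It assumes the call has to happen before any other LLamaSharp API is used, since the native library is loaded on first use. A llama.dll built from a different llama.cpp version than the one LLamaSharp targets may not be binary compatible, so treat this as a stopgap only.

```csharp
using LLama;
using LLama.Common;
using LLama.Native;

// Hypothetical location of the llama.dll taken from the llama.cpp b4792 release.
const string nativeLibraryPath = @"C:\llama.cpp\b4792\llama.dll";

// Configure the native library path first. Assumption: this must run before
// any other LLamaSharp call, because the library is loaded on first use.
NativeLibraryConfig.LLama.WithLibrary(nativeLibraryPath);

// Same call as in the reproduction steps, now resolved against the newer DLL.
using var weights = LLamaWeights.LoadFromFile(
    new ModelParams("Phi-4-mini-instruct-Q4_K_M.gguf"));
```

If the model still fails to load, or crashes later during inference, an API mismatch between the DLL and the LLamaSharp bindings is the most likely cause.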
This is a normal occurrence as LLamaSharp relies on llama.cpp, which updates very frequently, sometimes multiple times a day. You can directly download and use the updated DLL from the llama.cpp repository.
LLamaSharp is not generally compatible with any version of llama.cpp except the exact version it was updated to (see the table in the readme for the versions). The llama.cpp API frequently has breaking changes, so downloading DLLs from a different version will cause all kinds of issues.
There's a WIP PR (https://github.com/SciSharp/LLamaSharp/pull/1126) that will update llama.cpp to a newer version, which should include the new pre-tokenizer type.