LLamaSharp [BUG]: Unknown pre-tokenizer type: 'gpt-4o'

Description

Loading a Microsoft Phi-4-mini-instruct (4-bit quantization) model fails with: unknown pre-tokenizer type: 'gpt-4o'. This was already addressed upstream in llama.cpp b4792.

Reproduction Steps

LLamaWeights.LoadFromFile(new ModelParams("Phi-4-mini-instruct-Q4_K_M.gguf"));

Environment & Configuration

Operating system: Windows 11
.NET runtime version: 9
LLamaSharp version: 0.21.0
CUDA version (if you are using cuda backend): 12

Known Workarounds

Download the newer llama.cpp release b4792.
Call NativeLibraryConfig.LLama.WithLibrary to load the downloaded llama.dll instead of the bundled one.
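
A minimal sketch of the two steps above, assuming WithLibrary accepts the full path to the DLL and is called before anything else in LLamaSharp touches the native library. The DLL path is hypothetical; use wherever the b4792 release was extracted. Because the bindings in 0.21.0 were not built against b4792, this may fix model loading and still break other calls.

```csharp
using LLama;
using LLama.Common;
using LLama.Native;

// Hypothetical location of the llama.dll taken from the llama.cpp b4792 release.
const string nativeLibraryPath = @"C:\llama.cpp-b4792\llama.dll";

// Point LLamaSharp at the downloaded DLL. This has to happen before the first
// call that causes the native library to be loaded.
NativeLibraryConfig.LLama.WithLibrary(nativeLibraryPath);

// Same call as in the reproduction steps, now resolved against the newer DLL.
using var weights = LLamaWeights.LoadFromFile(
    new ModelParams("Phi-4-mini-instruct-Q4_K_M.gguf"));
```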

Mar 14 '25 09:03 koenigst

This is a normal occurrence as LLamaSharp relies on llama.cpp, which updates very frequently, sometimes multiple times a day. You can directly download and use the updated DLL from the llama.cpp repository.

Mar 14 '25 14:03 sangyuxiaowu

LLamaSharp is not generally compatible with any version of llama.cpp except the exact version it was updated to (see the table in the readme for the versions). The llama.cpp API frequently has breaking changes, so downloading DLLs from a different version will cause all kinds of issues.

There's a WIP PR (https://github.com/SciSharp/LLamaSharp/pull/1126) that will update llama.cpp to a newer version, which should include the new tokenizer type.

Mar 14 '25 14:03 martindevans

This issue has been automatically marked as stale due to inactivity. If no further activity occurs, it will be closed in 7 days.

May 23 '25 00:05 github-actions[bot]