llama.cpp icon indicating copy to clipboard operation
llama.cpp copied to clipboard

Mixtral 8x22B mixing up syllables

Open stefanvarunix opened this issue 1 year ago • 0 comments

Has anyone experienced this:

I get weird typos in German, e.g. vowels are typed twice ("ii" instead "i", or "aa" instead "a"), mixing up syllables up to writing write gibberish, mainly mixing syllables and letters. The longer the chat (e.g. the context), the worse it gets. In the beginning (e.g. first chat responses), it seems ok.

I downloaded and tested models from https://huggingface.co/MaziyarPanahi (different quants) and https://huggingface.co/mradermacher/Mixtral-8x22B-Instruct-v0.1-GGUF, did GGUF generation and quantization myself etc. Nothing helped.

The inferencing itself works and is quite fast (Apple M1 Ultra 128 GB 64C GPU).

I use the latest llama.cpp server and API (./server -m ...), chat via Panel ChatInterface, accessing the llama.cpp HTTP API with OpenAI's python library.

Is this maybe a problem with tokenizers or chat templates?

stefanvarunix avatar Apr 23 '24 13:04 stefanvarunix