Aya-23-8B gibberish if metal AND mmap turned on

Open mounta11n opened this issue 1 year ago • 4 comments

Hi there, I wanted to run Aya-23-8B from Cohere on the iPad Pro 12.9" M1, since Aya is an excellent multilingual LLM. It turned out that it gives me gibberish whenever Metal and mmap are turned on at the same time. I tried it many times with different configurations. It seems to be independent of the prompt format and other options; it depends only on the combination of Metal and mmap.

If metal=OFF and mmap=OFF, the time to first token is very long and generation is very slow (~0.15 t/s), but the answers are coherent.

If metal=OFF and mmap=ON, the time to first token is short (so mmap really does seem to be on) and the answers are coherent, but generation is still very slow (~0.15 t/s).

If metal=ON and mmap=OFF, it gives me coherent and correct answers at ~8 t/s.

If metal=ON AND mmap=ON, it spits out only gibberish and, interestingly, does so much faster than in the previous case: here I get 12 t/s for some reason.
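The four results above can also be written down as a small table to make the failing combination explicit. This is only a restatement of the numbers reported here; the names `observations` and `failing` are hypothetical and not part of LLMFarm.

```python
# Each entry restates one of the four runs reported above.
# tokens_per_s values are the approximate speeds given in the report.
observations = [
    {"metal": False, "mmap": False, "coherent": True,  "tokens_per_s": 0.15},
    {"metal": False, "mmap": True,  "coherent": True,  "tokens_per_s": 0.15},
    {"metal": True,  "mmap": False, "coherent": True,  "tokens_per_s": 8.0},
    {"metal": True,  "mmap": True,  "coherent": False, "tokens_per_s": 12.0},
]

# Collect the (metal, mmap) settings that produced incoherent output.
failing = [(o["metal"], o["mmap"]) for o in observations if not o["coherent"]]

print(failing)  # [(True, True)]
```

Neither setting breaks the output by itself; only the run with both enabled does.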

---

I have been using

  • Aya-23-8B-Q4_K_S.gguf
  • iPad Pro 12.9" M1
  • iPadOS 18.0 Developer Beta
  • LLMFarm 1.2.5 from TestFlight
  • Various Prompt Formats

---

metal=off mmap=off

IMG_0645

metal=on mmap=on

IMG_0646

mounta11n · Jun 22 '24 09:06

Hi. What version of LLMFarm are you using?

guinmoon · Jun 22 '24 10:06

I tried it with:

  • v. 1.2.0 (App Store)
  • v. 1.2.5 (TestFlight)

mounta11n · Jun 24 '24 01:06

Hi. What prompt format are you using?

Try this:

<|START_OF_TURN_TOKEN|><|USER_TOKEN|>{prompt}<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
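As a minimal sketch of how this template is meant to be filled, assuming `{prompt}` is a plain placeholder that gets replaced by the user's message: the helper name `render_prompt` is hypothetical and is not LLMFarm's actual code. The template string is copied unchanged from above.

```python
# Template copied verbatim from the comment above; {prompt} marks where
# the user's message goes.
AYA_TEMPLATE = (
    "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>{prompt}<|END_OF_TURN_TOKEN|>"
    "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|END_OF_TURN_TOKEN|>"
    "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
)


def render_prompt(user_text: str, template: str = AYA_TEMPLATE) -> str:
    """Substitute the user's message for the {prompt} placeholder.

    str.replace is used instead of str.format so that any other braces
    in the template are left untouched.
    """
    return template.replace("{prompt}", user_text)


print(render_prompt("Translate 'good morning' into Turkish."))
```

Printing the rendered string makes it easy to compare against what the app actually sends to the model.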

savkinavmono · Jun 24 '24 05:06

I think I was already using this prompt format. Note that in the two pictures above the configs are exactly the same; only mlock was turned off in the first picture and turned on in the second.

mounta11n · Jul 30 '24 13:07