llama.cpp acts too dumb while running on a phone!!
I was trying llama.cpp on my phone with Termux installed, but look at this image:

Specifications:
The phone has 8 GB of RAM (7 GB of it free) and the CPU has 8 cores, so RAM and CPU are not the issue.
Model used: alpaca-7B-lora
llama.cpp version: latest
Prompt: chat-with-bob.txt
I really don't know what is causing the issue here. When I ask it a question, it either answers in a very dumb way or just repeats the question without answering anything. With the same model, prompt, and llama.cpp version, my PC with 4 GB of RAM works as expected and answers nearly every question correctly. Can any of you help me out with this, or update llama.cpp to fix the mobile issues?
Thank you.
Try playing with the sampling settings, like increasing the temperature; see the sketch below.
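Something along these lines, for example (just a sketch; the model filename below is a placeholder, so point `-m` at whatever quantized file you are actually using):

```sh
# Interactive chat with the chat-with-bob prompt, with a higher temperature
# and a repeat penalty to discourage it from echoing the question back.
./main -m ./models/ggml-alpaca-7b-q4.bin \
  -f prompts/chat-with-bob.txt \
  -i -r "User:" --color -c 512 -n 256 \
  --temp 0.8 --top_k 40 --top_p 0.9 --repeat_penalty 1.2
```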
Okay, I'll try it and then let you know.
@BarfingLemurs Nope, it doesn't work; it stays the same (acts much dumber and still repeats the question). @gjmulder Please add a bug or similar label to this issue. It needs some more development.
You may want to log an issue with Stanford Alpaca. It is the training set for the Alpaca model you are using.
@gjmulder I'm using Alpaca LoRA, maybe that's the issue? That said, I don't think it's a problem with the Alpaca model itself: it worked fine on my laptop and for many users on their PCs, but it malfunctions on mobile.
Maybe it's because of Termux. I can't even get Firefox in Termux to play YouTube at 720p without crashing 😅😂. I guess it's the RAM restrictions your phone places on the Termux app.
@FNsi lol, it's not due to Termux. The same issue happens with UserLAnd (an app intended to run Ubuntu on Android without root).
I think it's almost the same, since they are all emulated terminals. I saw an Android fork on the Google Play store; maybe you can try it?
https://github.com/ggerganov/llama.cpp/discussions/750
The same model should work similarly with llama.cpp on any platform, if the same temp, top_k, etc. parameters are being passed to llama.cpp. The random number generator is different, so you will never get exactly the same output, but the outputs should be similar in quality. The only other difference I could imagine would be performance.
There are various optimized code paths that are only enabled for certain platforms and feature sets; there could be differences in the implementations of those.
Could you post the initial output with the system_info line and the model parameters? Something like the sketch below would capture it.
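Running roughly this on both the phone and the PC would make the runs comparable (a rough sketch only; the model path is a placeholder and the parameter values are just examples):

```sh
# Same prompt, same sampling parameters, and a fixed seed on both machines;
# tee the output so the startup banner (system_info line, model parameters)
# ends up in a log file you can paste here.
./main -m ./models/ggml-alpaca-7b-q4.bin \
  -f prompts/chat-with-bob.txt \
  -s 42 -t 4 -n 128 \
  --temp 0.7 --top_k 40 --top_p 0.9 --repeat_penalty 1.1 \
  2>&1 | tee run.log
```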
@unbounded The output I get is at the start of the issue (there is a screenshot of what the model is saying). The model parameters are the same as in the chat.sh file in the repository's examples directory.
System info:
ARM Cortex-A53 octa-core processor
8 GB RAM
Android 12
No AVX or AVX2 flags on the CPU, since it is an ARM processor.
Could be related to #876 which was fixed in https://github.com/ggerganov/llama.cpp/commit/684da25926e5c505f725b4f10b5485b218fa1fc7
Closing as assumed fixed by https://github.com/ggerganov/llama.cpp/commit/684da25926e5c505f725b4f10b5485b218fa1fc7; feel free to reopen if this still happens with the latest version.
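To retest, updating and rebuilding from a clean tree should be enough (assuming a plain make build, e.g. under Termux):

```sh
# Pull the latest llama.cpp and rebuild from scratch before retrying.
git pull
make clean && make
```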