Duccio Gasparri
It's a bug in the install script. > exec: ./quantize C:\Users\XXX\dalai\llama\models\7B\ggml-model-f16.bin C:\Users\XXX\dalai\llama\models\7B\ggml-model-q4_0.bin 2 in C:\Users\XXX\dalai\llama\build\Release The install script runs the command from C:\Users\XXX\dalai\llama\build\Release, but quantize.exe was built...
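One hypothetical way for an install script to avoid this working-directory mismatch is to build the command with the binary's absolute path rather than relying on the current directory. The function and paths below are illustrative only, not the actual dalai code:

```python
import os

def quantize_command(build_dir: str, model_f16: str, model_q4: str,
                     qtype: str = "2") -> list:
    """Build the quantize argv with an absolute path to the binary,
    so it is found no matter which directory the script runs from.
    (Sketch under assumed names; not the real install script.)"""
    exe_name = "quantize.exe" if os.name == "nt" else "quantize"
    exe = os.path.join(build_dir, exe_name)
    return [exe, model_f16, model_q4, qtype]
```

The returned list can then be passed to `subprocess.run(..., cwd=build_dir)` so relative model paths still resolve against the build directory.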
I have the same issue. I'm running Windows 10 on an Intel [email protected], 32GB RAM and an Nvidia GTX1050. The 7B model is painfully slow to run; it uses less than...
Memory is pretty straightforward (from the docs): 7B => ~4 GB, 13B => ~8 GB, 30B => ~16 GB, 65B => ~32 GB. Those are optimistic estimates; add +1GB each...
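As a sanity check, those figures are roughly consistent with half a byte per parameter (4-bit quantization) plus about a gigabyte of overhead. This is a back-of-the-envelope sketch, not an official formula from the docs:

```python
def estimated_ram_gb(params_billion: float,
                     bytes_per_param: float = 0.5,
                     overhead_gb: float = 1.0) -> float:
    """Rough RAM estimate for 4-bit-quantized weights: ~0.5 bytes
    per parameter plus a fixed overhead (assumed values)."""
    return params_billion * bytes_per_param + overhead_gb

for size in (7, 13, 30, 65):
    print(f"{size}B => ~{estimated_ram_gb(size):.1f} GB")
```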
Do you use the same definition of token used by OpenAI, that is, ~1,000 tokens per 750 words? Can you try to set the threads to 8 or 4? Just...
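For reference, OpenAI's published rule of thumb is about 0.75 English words per token. A tiny helper to convert between the two (a heuristic only, not an actual tokenizer):

```python
def words_to_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Estimate token count from word count using OpenAI's
    rule of thumb (~0.75 words per token)."""
    return round(word_count / words_per_token)

print(words_to_tokens(750))  # ~1000 tokens
```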
I have the same problem. This has been addressed in the Llama model but no patch is available yet (read the last comment from @ggerganov) https://github.com/ggerganov/llama.cpp/issues/599
I have a similar issue, but the behavior is not affected by the required field. I'm using 7.14.0-SNAPSHOT via Docker. My openapi.yaml: ```yaml date_estimate_to_iso8601: nullable: true type: string title:...