niubi.ai

2 comments

Is there any way to run it as quickly as using llama.cpp directly? I need to save each input and response.

> @dansinboy are you using the default server binary that comes with llama.cpp, or a binding?

You get the point. At first I used a binding via llama_cpp_python,...
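For anyone landing on this thread: a minimal sketch of saving each input and response while using the llama-cpp-python binding. The model path, log file name, and `log_exchange` helper are placeholders for illustration, not something from the thread.

```python
# Sketch: log every prompt/response pair to a JSONL file when calling
# the model through the llama-cpp-python binding.
import json
import time


def log_exchange(path, prompt, response):
    # Append one JSON object per line so the log survives crashes
    # and can be re-read exchange by exchange.
    record = {"ts": time.time(), "prompt": prompt, "response": response}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path="model.gguf")  # placeholder model path
    prompt = "Q: Name a planet. A:"
    out = llm(prompt, max_tokens=16)
    text = out["choices"][0]["text"]
    log_exchange("chat_log.jsonl", prompt, text)
```

If raw speed matters more than the Python binding's convenience, the same logging idea applies when driving the stock llama.cpp server over HTTP: record the request and response bodies at the client.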