TheVoidArbiter

Results: 1 comment by TheVoidArbiter

I have the same problem with the image localai/localai:v2.21.1-cublas-cuda12-core, but only when using llama.cpp with parallel requests enabled. So the metrics get generated for every slot, but with the same...