Running on GPU way slower than running on CPU
LocalAI version: Latest (v1.25.0)
Environment, CPU architecture, OS, and Version: GPU: NVIDIA GeForce MX250 (9.9 GB), RAM: 15.8 GB
Describe the bug
I tried running LocalAI with the --gpus all flag:
docker run -ti --gpus all -p 8080:8080 -e "DEBUG=true" -e "THREADS=14" -e "REBUILD=false" -v MODEL_PATH:/build/models:cached quay.io/go-skynet/local-ai:v1.25.0-cublas-cuda11
For the same chat completion request, the GPU takes 34 minutes while the CPU takes only 5 minutes.
Expected behavior: Using the GPU should be faster.
Logs
I noticed something interesting in these log messages:
WARNING: failed to allocate 1024.00 MB of pinned memory: out of memory
WARNING: failed to allocate 512.00 MB of pinned memory: out of memory
Question
- What is happening?
- Is there any way to configure "pinned memory"?
Can you please post your models yaml file?
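For reference, here is a minimal sketch of what a GPU-enabled model definition could look like (field names follow the LocalAI GPU-acceleration docs; the model filename and layer count below are placeholders, and on an MX250, which typically only has 2 GB of dedicated VRAM, gpu_layers likely needs to stay small):

name: gpt-3.5-turbo
backend: llama
context_size: 2048
f16: true
gpu_layers: 10   # layers offloaded to VRAM; keep this low on a small card
parameters:
  model: your-model.ggmlv3.q4_0.bin   # placeholder filename

If gpu_layers is missing or set to 0, the cuBLAS image still runs, but most of the work stays on the CPU while data is shuffled to and from the GPU, which can end up slower than plain CPU inference.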
In case you use Windows as a Docker host, make sure your models live inside the Linux filesystem and not the Windows filesystem. The transfer rate from a Windows-hosted filesystem is really slow when running things inside WSL.
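Another thing that might be worth trying, assuming the llama.cpp backend bundled in this image honors the GGML_CUDA_NO_PINNED environment variable: disable pinned (page-locked) host memory, which is exactly what the warnings above fail to allocate, and see whether the timings change:

docker run -ti --gpus all -p 8080:8080 -e "DEBUG=true" -e "THREADS=14" -e "REBUILD=false" -e "GGML_CUDA_NO_PINNED=1" -v MODEL_PATH:/build/models:cached quay.io/go-skynet/local-ai:v1.25.0-cublas-cuda11

When a pinned allocation fails, llama.cpp falls back to ordinary pageable memory, so the warnings themselves are not fatal, but they do point to host memory pressure during host-to-GPU transfers.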
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
I am going to close this issue for now, please tag me if this is in error