mistral.rs

Enable multiple CPU cores from arguments

Open lij55 opened this issue 1 year ago • 6 comments

I have a 32-core AMD CPU and no GPU. mistral.rs only uses two of the cores, which is too few. Is it possible to set the number of cores through arguments? Ollama uses half of the available cores by default.

Thanks!

lij55 Aug 13 '24 04:08

Hi @lij55 can you please let me know what the command you are running is?

EricLBuehler Aug 14 '24 11:08

Sorry for the late reply. The command is target/release/mistralrs-server -i plain -m microsoft/Phi-3-mini-128k-instruct -a phi3

lij55 Aug 17 '24 00:08

It also depends on which backend you are using. Is it the default backend or mkl? (That determines which gemm implementation is used.)

Each of these has a different setting for the number of CPUs used. For instance, mkl is controlled by the OMP_NUM_THREADS or MKL_NUM_THREADS environment variable. IIRC, the candle default backend is controlled by RAYON_NUM_THREADS. Try playing with these environment variables to see if anything changes.
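To make the RAYON_NUM_THREADS suggestion concrete, here is a minimal sketch of how a Rayon-style pool resolves its thread count, assuming the behavior in Rayon's documentation: a positive integer in the variable wins, otherwise the number of logical CPUs is used. The helper `resolve_thread_count` is hypothetical and is not part of mistral.rs, candle, or Rayon. Whether the default backend here actually follows this path has not been confirmed in this thread.

```rust
use std::env;
use std::thread;

/// Hypothetical helper (not part of mistral.rs, candle, or Rayon): picks a
/// thread count from an optional RAYON_NUM_THREADS value, falling back to the
/// number of logical CPUs when the value is missing, non-numeric, or zero.
fn resolve_thread_count(env_value: Option<&str>, logical_cpus: usize) -> usize {
    match env_value.and_then(|s| s.parse::<usize>().ok()) {
        Some(n) if n > 0 => n,
        _ => logical_cpus,
    }
}

fn main() {
    // Number of CPUs this process may use; falls back to 1 if it cannot be queried.
    let logical_cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);

    // Read the variable exactly as the shell exported it, if at all.
    let env_value = env::var("RAYON_NUM_THREADS").ok();
    let threads = resolve_thread_count(env_value.as_deref(), logical_cpus);

    println!("CPUs available to this process: {logical_cpus}");
    println!("RAYON_NUM_THREADS: {env_value:?}");
    println!("Thread count a Rayon-style pool would pick: {threads}");
}
```

Running this in the same shell and environment as the server gives a quick sanity check. If the first line already reports 2 on a 32-core machine, the process itself may be restricted (for example by CPU affinity or a container quota), in which case no thread-count variable would help.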

But it is still weird that only 2 cores are being used.

Again, we need to know which CPU backend is used to solve the issue.

mert-kurttutan Sep 03 '24 00:09

It is the default backend. I will try the OMP_NUM_THREADS environment variable; I hadn't noticed it. Thanks in advance!

lij55 Sep 03 '24 07:09

@lij55 did this work?

EricLBuehler Sep 19 '24 01:09

Sorry for the late reply. I tried both OMP_NUM_THREADS and MKL_NUM_THREADS, but neither had any effect. It still uses 2 cores. Is it related to the model? I ran phi3 with target/release/mistralrs-server -i plain -m microsoft/Phi-3-mini-128k-instruct -a phi3

Now I'm using ollama.

lij55 Sep 19 '24 03:09