Results: 10 comments by Shobhit

> Found a way to solve that issue, I had to change my ollama settings to `Environment="OLLAMA_HOST=PRIVATEIP"` to get it exposed ... it looks like if you listen on...

How can I make it work on Linux lol.
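
As a minimal sketch for Linux, assuming Ollama was installed via the official script and runs as a systemd service: the documented approach is a drop-in override setting `OLLAMA_HOST`. The bind address below is an assumption; use your private IP instead of `0.0.0.0` to limit exposure.

```sh
# Ollama on Linux typically runs under systemd; add a drop-in override:
sudo systemctl edit ollama.service

# In the editor that opens, add the following two lines
# (0.0.0.0 binds all interfaces; substitute your private IP to limit exposure):
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"

# Apply the change:
sudo systemctl daemon-reload
sudo systemctl restart ollama
```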

Ok, I found the solution. Go to the docker-compose.yml file. There you will find this code:

```yaml
environment:
  - TZ=${TIMEZONE}
  - HF_ENDPOINT=https://huggingface.com
```

You need to change it to this...
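
The original comment is truncated, so the replacement value is not shown; as a sketch only, the usual adjustment is to point `HF_ENDPOINT` at the canonical `huggingface.co` host (the `.com` domain above is not the Hub API endpoint) or at a reachable mirror. Both values below are assumptions, not the ones from the truncated comment:

```yaml
environment:
  - TZ=${TIMEZONE}
  # Assumption: the canonical Hub endpoint is https://huggingface.co;
  # a mirror such as https://hf-mirror.com is a common substitute when
  # huggingface.co is unreachable from inside the container.
  - HF_ENDPOINT=https://huggingface.co
```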

I'm also curious: can I compile the code and run it on Android?

Hi! I will take this up!

Yes, still am. Will share a pull request over the weekend when completed.

Hi @phymbert What is the architectural reason for having embedding live on a separate deployment from the model? Because requiring that would mean we would need to make changes to...

Ok, just to clarify: server.cpp has a route for requesting [embeddings ](https://github.com/ggerganov/llama.cpp/blob/928e0b7013c862cf10701957b3d654aa70f11bd8/examples/server/server.cpp#L3650), but the existing server code doesn't include the option to send embeddings for [completions](https://github.com/ggerganov/llama.cpp/blob/928e0b7013c862cf10701957b3d654aa70f11bd8/examples/server/server.cpp#L3397C1-L3398C1) ....
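
For reference, a minimal sketch of the two routes in question, assuming a llama.cpp server started with `--embedding` on the default `localhost:8080` (field names follow the server README; nothing here is specific to the truncated comment):

```sh
# /embedding returns a vector for the given text
# (the server must be launched with the --embedding flag):
curl http://localhost:8080/embedding \
  -H "Content-Type: application/json" \
  -d '{"content": "Hello, world"}'

# /completion only accepts a text prompt; there is no field for passing
# precomputed embeddings in, which is the gap described above:
curl http://localhost:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, world", "n_predict": 16}'
```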

Not yet. Currently testing it on a personal kube cluster with separate node selectors.
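
For illustration only, a sketch of what "separate node selectors" can look like: one of two hypothetical Deployments pinning the embedding server to its own node pool (all names, labels, and the image tag are assumptions, not taken from the thread):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llamacpp-embedding    # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llamacpp-embedding
  template:
    metadata:
      labels:
        app: llamacpp-embedding
    spec:
      nodeSelector:
        pool: embedding-nodes    # assumed node label; the model server
                                 # would target a different pool
      containers:
        - name: server
          image: ghcr.io/ggerganov/llama.cpp:server
          args: ["--embedding", "-m", "/models/model.gguf"]
```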

@phymbert I've made a pull request.