Shobhit
> Found a way to solve that issue; I had to change my ollama settings to `Environment="OLLAMA_HOST=PRIVATEIP"` to get it exposed ... it looks like if you listen on...
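For anyone else hitting this: a minimal sketch of where that setting typically lives on a systemd-based Linux install. The drop-in path and the IP are assumptions, adjust for your setup:

```
# /etc/systemd/system/ollama.service.d/override.conf  (assumed path; `sudo systemctl edit ollama` creates it)
[Service]
# Bind ollama to a specific private IP instead of 127.0.0.1
# (use 0.0.0.0 to listen on all interfaces)
Environment="OLLAMA_HOST=192.168.1.10"
```

Then reload and restart: `sudo systemctl daemon-reload && sudo systemctl restart ollama`.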
How can I make it work on Linux lol.
Ok I found the solution. Go to the docker-compose.yml file. There you will find this code:

```yaml
environment:
  - TZ=${TIMEZONE}
  - HF_ENDPOINT=https://huggingface.com
```

You need to change it to this...
I'm also curious whether I can compile the code and run it on Android?
Hi! I will take this up!
Yes, still am. Will share a pull request over the weekend when completed.
Hi @phymbert, what is the architectural reason for having embeddings live on a separate deployment from the model? Because requiring that would mean we would need to make changes to...
Ok, just to clarify: server.cpp has a route for requesting [embeddings](https://github.com/ggerganov/llama.cpp/blob/928e0b7013c862cf10701957b3d654aa70f11bd8/examples/server/server.cpp#L3650), but the existing server code doesn't include the option to send embeddings for [completions](https://github.com/ggerganov/llama.cpp/blob/928e0b7013c862cf10701957b3d654aa70f11bd8/examples/server/server.cpp#L3397C1-L3398C1) ....
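For context, a minimal sketch of hitting that embeddings route on a locally running server. The port and payload shape are assumptions based on the linked server.cpp, not confirmed against this exact revision:

```sh
# Assumed: llama.cpp server running on localhost:8080, started with an
# embedding-capable model and the --embedding flag
curl http://localhost:8080/embedding \
  -H "Content-Type: application/json" \
  -d '{"content": "some text to embed"}'
```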
Not yet. Currently testing it on a personal kube cluster with separate node selectors.
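In case "separate node selectors" is unclear, a minimal sketch of what that setup can look like; every name, label, and image here is a hypothetical illustration, not from the actual cluster:

```yaml
# Sketch: pin the embeddings pods to a dedicated node pool via nodeSelector
# (nodes would be labeled first, e.g. `kubectl label node <node> pool=embeddings`)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llama-embeddings   # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llama-embeddings
  template:
    metadata:
      labels:
        app: llama-embeddings
    spec:
      nodeSelector:
        pool: embeddings   # only schedules onto nodes carrying this label
      containers:
        - name: server
          image: ghcr.io/ggerganov/llama.cpp:server   # assumed image
          args: ["--embedding", "-m", "/models/model.gguf"]
```

A second Deployment for the completions model would use a different nodeSelector value, keeping the two workloads on separate nodes.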
@phymbert I've made a pull request.