host_ip causes an error in ChatQnA AIPC deployment with Ollama
Priority
P2-High
OS type
Ubuntu
Hardware type
AI-PC
Installation method
- [ ] Pull docker images from hub.docker.com
- [ ] Build docker images from source
Deploy method
- [X] Docker compose
- [ ] Docker
- [ ] Kubernetes
- [ ] Helm
Running nodes
Single Node
What's the version?
Description
In GenAIComps, Ollama is tested against localhost:

```bash
curl http://localhost:11434/api/generate -d '{ "model": "llama3", "prompt":"Why is the sky blue?" }'

curl http://127.0.0.1:9000/v1/chat/completions -X POST -d '{"model": "llama3", "query":"What is Deep Learning?","max_new_tokens":32,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' -H 'Content-Type: application/json'
```

Both of these work fine. But in GenAIExamples, `host_ip` is used instead of localhost, which causes a "Connection refused" error.
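For context, the Ollama systemd service listens only on 127.0.0.1:11434 by default, so requests to `${host_ip}:11434` are refused until `OLLAMA_HOST` is changed (see the setup steps further down). A quick way to confirm which address the server is bound to, assuming the iproute2 `ss` tool is available:

```bash
# "127.0.0.1:11434" here means only localhost connections are accepted,
# so curl against ${host_ip}:11434 will get "Connection refused".
sudo ss -tlnp | grep 11434
```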
Reproduce steps
```bash
cd GenAIExamples/ChatQnA/docker/aipc
docker compose up -d
# run ollama
curl http://${host_ip}:9000/v1/chat/completions \
    -X POST \
    -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
    -H 'Content-Type: application/json'
```
Raw log
No response; curl reports:

```
curl: (7) Failed to connect to x.x.x.x port 11434 after 0 ms: Connection refused
```
@devpramod can you add a meaningful title to the issue? Do you have host_ip set?
@dcmiddle Yes, `host_ip` is set, and all the other services work fine with it. For Ollama, both the backend LLM service (i.e. Ollama itself) and the LLM microservice don't work with `host_ip`; they only work when set to localhost.
@devpramod Have you set up the Ollama service?
Set Up Ollama LLM Service
Install the Ollama service with one command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Set Ollama Service Configuration

The Ollama service configuration file is /etc/systemd/system/ollama.service. Edit the file to set the OLLAMA_HOST environment variable (replace ${host_ip} with your host IPv4 address):

```
Environment="OLLAMA_HOST=${host_ip}:11434"
```

Set the https_proxy environment variable for Ollama if your system accesses the network through a proxy:

```
Environment="https_proxy=http://proxy.ims.intel.com:911"
```
Restart the Ollama service:

```bash
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
```
Pull the LLM model:

```bash
export OLLAMA_HOST=http://${host_ip}:11434
ollama pull llama3
ollama list
```

```
NAME            ID              SIZE    MODIFIED
llama3:latest   365c0bd3c000    4.7 GB  5 days ago
```
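As a quick sanity check after the pull (this is the same request used in the next comment), the generate API should now answer on the host IP rather than only on localhost:

```bash
# Ollama should now accept connections on ${host_ip}, not just 127.0.0.1.
curl http://${host_ip}:11434/api/generate -d '{"model": "llama3", "prompt":"What is Deep Learning?"}'
```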
Submitted PR #874 to update the AIPC README.
@xiguiw I'm able to use `curl http://${host_ip}:11434/api/generate -d '{"model": "llama3", "prompt":"What is Deep Learning?"}'` successfully, thanks for the solution.
I faced one issue though:
After running `sudo systemctl restart ollama.service` I got the following error:
Error: could not connect to ollama app, is it running?
Then I had to open a different terminal and run `ollama serve`. Then it worked.
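For reference, a minimal sketch of that workaround, assuming no other Ollama instance is already holding port 11434: running the server in the foreground honors the same `OLLAMA_HOST` setting as the systemd unit.

```bash
# In a separate terminal: serve Ollama in the foreground, bound to the host IP.
export OLLAMA_HOST=${host_ip}:11434
ollama serve
```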
@devpramod Not sure what the problem is. Here are some possibilities:

- If changes in the service configuration aren't being recognized, try reloading the systemd daemon and restarting the service:

  ```bash
  sudo systemctl daemon-reload
  sudo systemctl restart ollama.service
  ```

- Before restarting, make sure the ollama service is actually started (a log check is sketched after this list):

  ```bash
  sudo systemctl status ollama.service
  sudo systemctl start ollama.service
  ```
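If the restart itself keeps failing, one hedged suggestion (assuming the systemd-managed install from the steps above): the unit's recent journal entries usually show the underlying error.

```bash
# Inspect the most recent log lines from the ollama unit for the root cause.
journalctl -u ollama.service -n 30 --no-pager
```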
@devpramod, please follow the latest code and README to build the images and run docker compose. It should work well, as we verified it recently.
@devpramod, is your issue fixed by the latest code? Can we close this issue?
@lvliang-intel I have verified the fix. Thanks.