host_ip causes an error in ChatQnA AIPC deployment with Ollama
Priority
P2-High
OS type
Ubuntu
Hardware type
AI-PC
Installation method
- [ ] Pull docker images from hub.docker.com
- [ ] Build docker images from source
Deploy method
- [X] Docker compose
- [ ] Docker
- [ ] Kubernetes
- [ ] Helm
Running nodes
Single Node
What's the version?
Description
In GenAIComps, Ollama is tested against localhost:

```bash
curl http://localhost:11434/api/generate -d '{ "model": "llama3", "prompt":"Why is the sky blue?" }'

curl http://127.0.0.1:9000/v1/chat/completions -X POST -d '{"model": "llama3", "query":"What is Deep Learning?","max_new_tokens":32,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' -H 'Content-Type: application/json'
```

Both of these work fine. But in GenAIExamples, `host_ip` is used instead of localhost, which causes a "Connection refused" error.
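For context, the Ollama systemd service listens only on 127.0.0.1:11434 by default, so requests to `${host_ip}:11434` are refused until `OLLAMA_HOST` is changed (see the setup steps further down). A quick way to confirm which address the server is bound to, assuming the iproute2 `ss` tool is available:

```bash
# "127.0.0.1:11434" here means only localhost connections are accepted,
# so curl against ${host_ip}:11434 will get "Connection refused".
sudo ss -tlnp | grep 11434
```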
Reproduce steps
```bash
cd GenAIExamples/ChatQnA/docker/aipc
docker compose up -d
# run ollama
curl http://${host_ip}:9000/v1/chat/completions \
    -X POST \
    -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
    -H 'Content-Type: application/json'
```
Raw log
No response; curl reports:

```
curl: (7) Failed to connect to x.x.x.x port 11434 after 0 ms: Connection refused
```
@devpramod can you add a meaningful title to the issue? Do you have host_ip set?
@dcmiddle Yes, `host_ip` is set, and all the other services work fine with it. For Ollama, both the backend LLM service (i.e. Ollama itself) and the LLM microservice don't work with `host_ip`; they only work when set to localhost.
@devpramod Have you set up the Ollama service?
Set Up Ollama LLM Service
Install the Ollama service with one command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Set Ollama Service Configuration

The Ollama service configuration file is /etc/systemd/system/ollama.service. Edit the file to set the OLLAMA_HOST environment variable (replace ${host_ip} with your host IPv4 address):

```
Environment="OLLAMA_HOST=${host_ip}:11434"
```

Set the https_proxy environment variable for Ollama if your system accesses the network through a proxy:

```
Environment="https_proxy=http://proxy.ims.intel.com:911"
```
Restart the Ollama service:

```bash
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
```
Pull the LLM model:

```bash
export OLLAMA_HOST=http://${host_ip}:11434
ollama pull llama3
ollama list
```

```
NAME            ID              SIZE    MODIFIED
llama3:latest   365c0bd3c000    4.7 GB  5 days ago
```
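As a quick sanity check after the pull (this is the same request used in the next comment), the generate API should now answer on the host IP rather than only on localhost:

```bash
# Ollama should now accept connections on ${host_ip}, not just 127.0.0.1.
curl http://${host_ip}:11434/api/generate -d '{"model": "llama3", "prompt":"What is Deep Learning?"}'
```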
Submitted PR #874 to update the AIPC README.
@xiguiw I'm able to use `curl http://${host_ip}:11434/api/generate -d '{"model": "llama3", "prompt":"What is Deep Learning?"}'` successfully, thanks for the solution.
I faced one issue though:
After running `sudo systemctl restart ollama.service` I got the following error:
Error: could not connect to ollama app, is it running?
Then I had to open a different terminal and run `ollama serve`. Then it worked.
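For reference, a minimal sketch of that workaround, assuming no other Ollama instance is already holding port 11434: running the server in the foreground honors the same `OLLAMA_HOST` setting as the systemd unit.

```bash
# In a separate terminal: serve Ollama in the foreground, bound to the host IP.
export OLLAMA_HOST=${host_ip}:11434
ollama serve
```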
@devpramod Not sure what the problem is. Here are some possibilities:

- If changes in the service configuration aren't being recognized, try reloading the systemd daemon and restarting the service:

  ```bash
  sudo systemctl daemon-reload
  sudo systemctl restart ollama.service
  ```

- Before restarting, make sure the ollama service is actually started (a log check is sketched after this list):

  ```bash
  sudo systemctl status ollama.service
  sudo systemctl start ollama.service
  ```
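If the restart itself keeps failing, one hedged suggestion (assuming the systemd-managed install from the steps above): the unit's recent journal entries usually show the underlying error.

```bash
# Inspect the most recent log lines from the ollama unit for the root cause.
journalctl -u ollama.service -n 30 --no-pager
```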
@devpramod, please follow the latest code and README to build the images and run docker compose. It should work well, as we verified it recently.
@devpramod, is your issue fixed by the latest code? Can we close this issue?
@lvliang-intel I have verified the fix. Thanks.