Enable running ChatQnA example on Docker Kubernetes
One of OPEA's goals is to empower developers to tackle GenAI on their laptops and/or single-node settings. The GenAI examples run on single-node Docker instances and, for production environments, have been vetted on Kubernetes. Being able to test on Kubernetes on a single node will make the transition from dev to prod even smoother. https://docs.docker.com/desktop/kubernetes/
/assigntome
Welcome @arcyleung ! For the ChatQnA example you could try using, say, TinyLlama/TinyLlama-1.1B-step-50K-105b, and a smaller model for the re-ranker too.
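For anyone else trying this on a memory-constrained machine, a minimal sketch of swapping in smaller models might look like the following. The variable names (`LLM_MODEL_ID`, `RERANK_MODEL_ID`) follow the pattern the ChatQnA `set_env.sh` scripts tend to use, but verify them against your checkout; the reranker model shown is just one small option.

```shell
# Hedged sketch: override the default models with smaller ones before
# bringing up the ChatQnA services. Variable names are assumptions --
# check the example's set_env.sh for the exact ones it reads.
export LLM_MODEL_ID="TinyLlama/TinyLlama-1.1B-step-50K-105b"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"   # a smaller reranker choice

echo "LLM model:    $LLM_MODEL_ID"
echo "Rerank model: $RERANK_MODEL_ID"
```

With these exported, the usual compose or manifest deployment would pick up the smaller models, assuming the deployment reads these variables.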
If I understand correctly, the task is to add an example of running the ChatQnA example locally, using minikube or kind?
I'm leaning more towards kind, since there is already a Docker-based deployment and I'd just have to create another Dockerfile for kind to spin up for this single-node testing use case.
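For reference, a single-node kind cluster for this kind of test can be described with a small config file. This is an illustrative sketch only; the port numbers are assumptions and should be adjusted to whatever NodePorts the ChatQnA manifests actually expose.

```yaml
# kind-config.yaml -- hypothetical single-node cluster for local testing.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30888   # example NodePort for the ChatQnA UI (assumption)
        hostPort: 8888
```

One would then create the cluster with `kind create cluster --config kind-config.yaml` and apply the example's manifests into it.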
@arcyleung How are you doing with this issue?
I've gotten the deployment working on minikube and will write up a README with the steps shortly. The cloud instance I was using had run out of credits, and my laptop only has 16GB RAM, so deploying with the Xeon manifests didn't work (one pod went OOM), but the rest of the services did appear to work:
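On a 16GB laptop, one workaround for the OOM is to lower the memory requests/limits of the heaviest pod (typically the LLM serving container) in the manifest. A hedged sketch of such a patch, with made-up container name and values to tune for your machine:

```yaml
# Illustrative resource override for the LLM serving container.
# The container name and memory figures are assumptions -- match them
# to the actual Xeon manifests and your available RAM.
containers:
  - name: llm-serving
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "8Gi"
```

Even with this, a 7B-class model may not fit; combining the limit with a smaller model such as TinyLlama is likely necessary on 16GB.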
I'll try to find another Xeon machine in the meantime.
@arcyleung Please place the PR URL here so I can close out this issue when this is complete.
Please see the following PR for the instructions: https://github.com/opea-project/GenAIExamples/pull/1058
@mkbhanda Could we close this issue?
Thank you @arcyleung and @xiguiw !