[Bug] OPEA ChatQnA RAG example (compose_vllm.yaml) does not work with vllm
Priority
P2-High
OS type
Ubuntu
Hardware type
Xeon-SPR
Installation method
- [X] Pull docker images from hub.docker.com
- [ ] Build docker images from source
Deploy method
- [X] Docker compose
- [X] Docker
- [X] Kubernetes
- [X] Helm
Running nodes
Single Node
What's the version?
https://hub.docker.com/r/opea/llm-vllm/tags
Description
The OPEA ChatQnA RAG example does not work with vLLM: ChatQnA with RAG does not take the retrieved content into account.
Root cause: the vLLM microservice connector (llm.py) does not pass the retrieved content to the model; instead, it calls the vLLM model serving endpoint with the raw input query only.
Reference: https://github.com/opea-project/GenAIComps/blob/main/comps/llms/text-generation/vllm-ray/llm.py
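For illustration, here is a minimal sketch of the expected behavior: the retrieved chunks should be folded into the prompt before the request is sent to vLLM, whereas the connector currently sends only the query. The function names, prompt template, endpoint URL, and model name below are assumptions for illustration and do not mirror the actual llm.py implementation.

```python
# Hypothetical sketch: fold retrieved context into the prompt before calling
# the vLLM serving endpoint. Names, URL, and model are illustrative only.
import requests

VLLM_ENDPOINT = "http://vllm-service:8008/v1/completions"  # assumed OpenAI-compatible route


def build_rag_prompt(query: str, retrieved_docs: list[str]) -> str:
    """Combine retrieved chunks with the user query into a single prompt."""
    context = "\n".join(retrieved_docs)
    return (
        "### Use the following context to answer the question.\n"
        f"### Context:\n{context}\n"
        f"### Question: {query}\n"
        "### Answer:"
    )


def generate(query: str, retrieved_docs: list[str],
             model: str = "Intel/neural-chat-7b-v3-3") -> str:
    # The reported bug: the connector sends `query` alone at this point, so the
    # retrieved documents never reach the model. The fix is to send the RAG
    # prompt built above instead.
    prompt = build_rag_prompt(query, retrieved_docs)
    resp = requests.post(
        VLLM_ENDPOINT,
        json={"model": model, "prompt": prompt, "max_tokens": 512, "temperature": 0.0},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]
```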
Reproduce steps
- Run the compose file: `docker compose -f compose_vllm.yaml up`. The application comes up.
- Upload a file.
- Ask queries specific to the uploaded document from the interface (a hedged verification sketch follows this list).
- Observe that the results do not contain the relevant content from the uploaded documents.
- The issue is reproducible in Helm, Docker, and Kubernetes deployments.
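To make the failure easier to confirm without the UI, the deployed ChatQnA megaservice can be queried directly. The host, port, route, and payload shape below are assumptions based on a default docker compose setup and may differ in your deployment.

```python
# Hypothetical check: query the ChatQnA megaservice after uploading a document.
# Host, port, route, and payload shape are assumptions; adjust to your deployment.
import requests

CHATQNA_URL = "http://localhost:8888/v1/chatqna"  # assumed default megaservice endpoint


def ask(question: str) -> str:
    resp = requests.post(CHATQNA_URL, json={"messages": question}, timeout=120)
    resp.raise_for_status()
    # With the bug present, the answer ignores the uploaded document's content.
    return resp.text


if __name__ == "__main__":
    print(ask("What does section 3 of the uploaded document say?"))
```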
Raw log
No response
The llm.py link should be this one, not the one in the vllm-ray directory: https://github.com/opea-project/GenAIComps/blob/main/comps/llms/text-generation/vllm/langchain/llm.py
This looks like a Docker image issue; reassigning to the owner [email protected]
Fixed by PR https://github.com/opea-project/GenAIComps/pull/687