
[Bug] OPEA ChatQnA RAG example (compose_vllm.yaml) does not work with vllm

Open · yogeshmpandey opened this issue 1 year ago · 1 comment

Priority

P2-High

OS type

Ubuntu

Hardware type

Xeon-SPR

Installation method

  • [X] Pull docker images from hub.docker.com
  • [ ] Build docker images from source

Deploy method

  • [X] Docker compose
  • [X] Docker
  • [X] Kubernetes
  • [X] Helm

Running nodes

Single Node

What's the version?

https://hub.docker.com/r/opea/llm-vllm/tags

Description

OPEA ChatQnA RAG example does not work with vllm.

OPEA ChatQnA with RAG does not consider the retrieved content.

Root cause: the vLLM LLM microservice connector does not fold the retrieved content into the prompt; it sends the raw input query directly to the vLLM model-serving endpoint.

Reference: https://github.com/opea-project/GenAIComps/blob/main/comps/llms/text-generation/vllm-ray/llm.py
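
For illustration, a minimal sketch of the expected behavior (the helper name build_rag_prompt, the environment-variable defaults, and the prompt template are assumptions for this example, not the actual GenAIComps code): the documents returned by the retriever/reranker should be folded into the prompt before the request reaches the vLLM OpenAI-compatible serving endpoint, instead of forwarding the raw query.

```python
# Hypothetical sketch of the expected RAG prompt construction in the LLM microservice.
# build_rag_prompt, the env-variable defaults, and the prompt template are illustrative
# assumptions, not the actual GenAIComps implementation.
import os
import requests

VLLM_ENDPOINT = os.getenv("vLLM_ENDPOINT", "http://localhost:8008")
LLM_MODEL = os.getenv("LLM_MODEL_ID", "Intel/neural-chat-7b-v3-3")


def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Fold the retrieved context into the prompt instead of sending the raw query."""
    context = "\n".join(retrieved_docs)
    return (
        "Answer the question using only the following context.\n"
        f"### Context:\n{context}\n"
        f"### Question: {question}\n### Answer:"
    )


def generate(question: str, retrieved_docs: list[str]) -> str:
    # The reported bug amounts to sending `question` alone as the prompt at this step.
    prompt = build_rag_prompt(question, retrieved_docs)
    resp = requests.post(
        f"{VLLM_ENDPOINT}/v1/completions",
        json={"model": LLM_MODEL, "prompt": prompt, "max_tokens": 512, "temperature": 0.1},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]
```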

Reproduce steps

  • Run the compose file: docker compose -f compose_vllm.yaml up. The application comes up.

  • Upload a file

  • From the interface, ask queries specific to the uploaded document (a sketch for querying the service directly follows this list).

  (screenshot of the chat response attached in the original issue)

  • We see that the results do not contain the relevant content from the documents.

  • The issue is reproducible in Helm, Docker, and Kubernetes deployments.
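
To check the behavior without the UI, the ChatQnA gateway can be queried directly. A minimal reproduction sketch, assuming the default megaservice port 8888 and the /v1/chatqna route from the ChatQnA compose setup (adjust host, port, and question to your deployment):

```python
# Minimal reproduction sketch: send a document-specific question to the ChatQnA megaservice.
# Port 8888 and the /v1/chatqna route are assumptions based on the default ChatQnA compose
# configuration; adjust them to match your deployment.
import requests

HOST = "http://localhost:8888"

resp = requests.post(
    f"{HOST}/v1/chatqna",
    json={"messages": "What does the uploaded document say about <your topic>?"},
    timeout=120,
)
resp.raise_for_status()
# With the reported bug, the answer ignores the ingested document content.
print(resp.text)
```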

Raw log

No response

yogeshmpandey · Sep 06 '24 08:09

The llm.py link should be this one, not in the vllm-ray directory: https://github.com/opea-project/GenAIComps/blob/main/comps/llms/text-generation/vllm/langchain/llm.py

This looks like a docker image issue; reassigning to owner [email protected].

yongfengdu · Sep 12 '24 02:09

Fixed by PR https://github.com/opea-project/GenAIComps/pull/687

lvliang-intel · Nov 03 '24 10:11