rag-api-server
Support multi-pass RAG search
The current approach of searching only the last user message for RAG content is too simplistic, especially in multi-turn conversations or in agentic apps where the agent automatically adds to or rephrases the last user message.
I think we need to combine the last 3 to 5 user messages and perform a second search pass over the combined text. The highest-scoring vectors from both searches would then be selected for the context.
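To make the proposal concrete, here is a rough sketch of the two-pass idea in Python. It is not the server's actual code: `Hit`, `multi_pass_search`, and the `search` callable are hypothetical names, and the sketch assumes both passes use the same embedding model and similarity metric so that scores from the two searches can be compared directly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hit:
    """One scored search result (hypothetical shape, for illustration)."""
    chunk_id: str
    score: float
    text: str


# Hypothetical search callable: (query text, max results) -> scored hits.
SearchFn = Callable[[str, int], List[Hit]]


def multi_pass_search(
    user_messages: List[str],
    search: SearchFn,
    window: int = 3,
    limit: int = 5,
) -> List[Hit]:
    """Search with the last user message, then with the last `window`
    user messages combined, and keep the highest-scoring unique hits."""
    if not user_messages:
        return []

    # Pass 1: the last user message only (the current behavior).
    hits = list(search(user_messages[-1], limit))

    # Pass 2: the last `window` user messages joined into one query.
    # Skipped when there is only one message, since it would repeat pass 1.
    if len(user_messages) > 1:
        combined = "\n".join(user_messages[-window:])
        hits.extend(search(combined, limit))

    # Deduplicate by chunk id, keeping the better score for each chunk.
    best: Dict[str, Hit] = {}
    for hit in hits:
        previous = best.get(hit.chunk_id)
        if previous is None or hit.score > previous.score:
            best[hit.chunk_id] = hit

    # Highest scores first, truncated to the context budget.
    return sorted(best.values(), key=lambda h: h.score, reverse=True)[:limit]
```

The `window` default of 3 reflects the lower end of the "3 to 5" range above. Whether raw scores from a short query and a longer combined query are really comparable is an open question; if they are not, some normalization or rank-based merging might be needed instead.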