[Feature Request] Programmable pipeline for AskAsync
Context / Scenario
Currently, when you call the AskAsync method of the memory, the system performs a vector search and passes the top X most relevant results to the LLM.
It would be interesting to make this pipeline customizable, so we could retrieve documents with other techniques: for example query expansion, re-ranking, or running BM25 in parallel with vector search and then re-ranking the combined results.
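One common way to merge the BM25 and vector result lists mentioned above is reciprocal rank fusion (RRF). Here is a minimal, library-agnostic sketch; the function and document names are illustrative and not part of Kernel Memory:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    rankings: list of ranked lists, each ordered best-first.
    k: damping constant; 60 is the customary default from the RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Example: BM25 and vector search return overlapping but different orderings.
bm25_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc5", "doc3"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
# doc1 wins: it ranks high in both lists.
```

The appeal of RRF as a default fusion stage is that it only needs ranks, not scores, so it works even when BM25 and cosine-similarity scores are on incomparable scales.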
The ingestion pipeline is fully customizable; it would be fantastic if the query side could be customizable as well.
The problem
It is difficult to implement advanced retrieval techniques such as re-ranking or query expansion.
Proposed solution
It would be nice if AskAsync were simply changed into a pipeline with default components. The default pipeline would have two stages: the first performs a vector search, and the second passes the documents to the LLM in the order the vector search returned them. Users could then configure a different pipeline for more advanced techniques.
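To make the proposal concrete, the query path could be modeled as an ordered list of handlers that each transform a shared context, mirroring how the ingestion pipeline already works. This is only a sketch of the shape of the API; every name here is hypothetical, not an existing Kernel Memory type:

```python
from dataclasses import dataclass, field

@dataclass
class QueryContext:
    """State threaded through the query pipeline."""
    question: str
    results: list = field(default_factory=list)  # (doc_id, score) pairs

class QueryPipeline:
    """Runs each handler in order; each handler returns the updated context."""
    def __init__(self, handlers):
        self.handlers = handlers

    def run(self, question):
        ctx = QueryContext(question=question)
        for handler in self.handlers:
            ctx = handler(ctx)
        return ctx.results

# Default two-stage pipeline: vector search, then order by score.
def vector_search(ctx):
    # Placeholder: a real handler would query the vector store.
    ctx.results = [("doc2", 0.7), ("doc1", 0.9)]
    return ctx

def order_by_score(ctx):
    # A custom stage (e.g. a re-ranker) could replace or extend this step.
    ctx.results.sort(key=lambda pair: pair[1], reverse=True)
    return ctx

pipeline = QueryPipeline([vector_search, order_by_score])
answer_context = pipeline.run("what is kernel memory?")
```

With this shape, swapping in query expansion or a re-ranker is just a matter of inserting or replacing a handler in the list, the same way custom handlers are added to the ingestion pipeline today.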
Importance
would be great to have
@dluc I'm trying to lay down an example. I'm starting here https://github.com/alkampfergit/SemanticKernelPlayground/blob/feature/better_search/200_CSharpSemanticMemory/KernelMemorySamples/Samples/CustomPipelineBase.cs with a simple way to decouple search/query; as a next step I'll add a re-ranker.
+1. At minimum, the ability to easily support re-rankers would be highly beneficial.
I've done it here https://github.com/alkampfergit/KernelMemory.Extensions
If you're interested, I've also made a couple of videos on a chain with Keyword + Vector -> re-ranking.
You can find it here https://www.linkedin.com/feed/update/urn:li:activity:7195302978747060225/