[Feature Request] Programmable pipeline for AskAsync
Context / Scenario
Currently, when you call the AskAsync method of the memory, the system performs a vector search and passes the top X most relevant results to the LLM.
It would be interesting to make this pipeline customizable, so we could retrieve documents with other techniques: for example query expansion, re-ranking, or running BM25 in parallel with vector search and then re-ranking the combined results.
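One common way to merge the BM25 and vector result lists mentioned above is reciprocal rank fusion (RRF). Here is a minimal, library-agnostic sketch; the function and document names are illustrative and not part of Kernel Memory:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    rankings: list of ranked lists, each ordered best-first.
    k: damping constant; 60 is the customary default from the RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Example: BM25 and vector search return overlapping but different orderings.
bm25_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc5", "doc3"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
# doc1 wins: it ranks high in both lists.
```

The appeal of RRF as a default fusion stage is that it only needs ranks, not scores, so it works even when BM25 and cosine-similarity scores are on incomparable scales.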
The ingestion pipeline is fully customizable; it would be fantastic if the query side could be customizable as well.
The problem
It is difficult to implement advanced retrieval techniques such as re-ranking or query expansion.
Proposed solution
It would be nice if AskAsync were simply changed into a pipeline with default components. The default pipeline would have two stages: the first performs a vector search, and the second passes the documents to the LLM in the order the vector search returned them. Users could then configure a different pipeline for more advanced techniques.
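To make the proposal concrete, the query path could be modeled as an ordered list of handlers that each transform a shared context, mirroring how the ingestion pipeline already works. This is only a sketch of the shape of the API; every name here is hypothetical, not an existing Kernel Memory type:

```python
from dataclasses import dataclass, field

@dataclass
class QueryContext:
    """State threaded through the query pipeline."""
    question: str
    results: list = field(default_factory=list)  # (doc_id, score) pairs

class QueryPipeline:
    """Runs each handler in order; each handler returns the updated context."""
    def __init__(self, handlers):
        self.handlers = handlers

    def run(self, question):
        ctx = QueryContext(question=question)
        for handler in self.handlers:
            ctx = handler(ctx)
        return ctx.results

# Default two-stage pipeline: vector search, then order by score.
def vector_search(ctx):
    # Placeholder: a real handler would query the vector store.
    ctx.results = [("doc2", 0.7), ("doc1", 0.9)]
    return ctx

def order_by_score(ctx):
    # A custom stage (e.g. a re-ranker) could replace or extend this step.
    ctx.results.sort(key=lambda pair: pair[1], reverse=True)
    return ctx

pipeline = QueryPipeline([vector_search, order_by_score])
answer_context = pipeline.run("what is kernel memory?")
```

With this shape, swapping in query expansion or a re-ranker is just a matter of inserting or replacing a handler in the list, the same way custom handlers are added to the ingestion pipeline today.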
Importance
would be great to have
@dluc I'm trying to lay down an example. I'm starting here https://github.com/alkampfergit/SemanticKernelPlayground/blob/feature/better_search/200_CSharpSemanticMemory/KernelMemorySamples/Samples/CustomPipelineBase.cs with a simple way to decouple search/query; as a next step I'll add a re-ranker.
+1. At minimum, the ability to easily support re-rankers would be highly beneficial.
I've done it here https://github.com/alkampfergit/KernelMemory.Extensions
If you're interested, I've also made a couple of videos on a chain with Keyword + Vector -> re-ranking.
You can find it here https://www.linkedin.com/feed/update/urn:li:activity:7195302978747060225/