instinct.cpp

Advanced RAG implementations

Open RobinQu opened this issue 1 year ago • 4 comments

For better evaluation results on the HF QA dataset.

  • Reranking: BCE, BGE-M3 scoring
  • Query rewrite
    • to generate a SQL filter for the given prompt
    • to generate hypothetical queries
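A minimal sketch of the hypothetical-query side of query rewrite. The `llm` callable, the instruction wording, and `fake_llm` are assumptions made for illustration and are not part of instinct.cpp; the SQL-filter rewrite is not covered here.

```python
import re
from typing import Callable, List

# Matches a leading bullet ("-", "*", "•") or a numbered prefix ("1.", "2)").
_LIST_PREFIX = re.compile(r"^\s*(?:[-•*]|\d+[.)])\s*")


def rewrite_query(prompt: str, llm: Callable[[str], str], n: int = 3) -> List[str]:
    """Ask an LLM for up to n hypothetical queries; the original prompt stays first."""
    instruction = (
        f"Write {n} alternative search queries, one per line, "
        f"for the following question:\n{prompt}"
    )
    queries = [prompt]
    for line in llm(instruction).splitlines():
        candidate = _LIST_PREFIX.sub("", line).strip()
        if candidate and candidate not in queries:
            queries.append(candidate)
    return queries[: n + 1]


def fake_llm(instruction: str) -> str:
    # Stand-in for a real model call, only here to make the sketch runnable.
    return "1. what is reranking in RAG\n2. how do rerankers improve retrieval\n"
```

Each returned query would then be sent to the retriever and the results merged; how to merge them is left open.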

RobinQu avatar Apr 03 '24 14:04 RobinQu

A more complex RAG pipeline may involve agent frameworks (#18).

Adaptive RAG

https://github.com/langchain-ai/langgraph/blob/70c1c996a4c9fe8df518bcd849b3c6453dd0d58b/examples/rag/langgraph_adaptive_rag.ipynb

RobinQu avatar Apr 05 '24 05:04 RobinQu

Rerank, ColBERT, ...

Related work

Methods

  • Keyword-based - BM25
  • Cross-encoders - BCE reranker
  • Late-interaction - ColBERT, BGE-M3
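To make the late-interaction entry concrete, here is a small sketch of ColBERT-style MaxSim scoring over per-token embeddings. It uses plain Python lists in place of real model outputs; the function names are illustrative and do not come from any of the libraries mentioned in this thread.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token take the best-matching
    document token, then sum those maxima over the query tokens."""
    if not doc_vecs:
        return 0.0
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)
```

Document token vectors can be computed once at indexing time, so only the query needs encoding at search time. A cross-encoder instead runs a full forward pass for every query-document pair.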

Open-source projects

  • ColBERT via RAGatouille (a wrapper library around ColBERT, not the official implementation): https://github.com/bclavie/RAGatouille
  • BCE family: https://github.com/netease-youdao/BCEmbedding
  • BAAI BGE family: https://github.com/FlagOpen/FlagEmbedding
  • BCE & BGE in C++: https://github.com/li-plus/chatglm.cpp

Conclusion

  • Both the BCE and BGE families can be regarded as SOTA.
  • For multilingual use cases, vanilla RAGatouille lags behind.
  • Because late-interaction models are faster at inference than cross-encoders, bge-m3 is preferred as the first-stage ranking method in the RAG pipeline.
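The staging implied by the conclusion could look roughly like this: a cheap first-stage scorer narrows the candidates, and a more expensive reranker orders the survivors. Both scorers are hypothetical callables. In a real pipeline the first stage might be backed by bge-m3 and the reranker by a cross-encoder such as BCE; the toy word-overlap scorers below exist only to make the sketch runnable.

```python
from typing import Callable, List

Scorer = Callable[[str, str], float]  # (query, document) -> relevance score


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    first_stage: Scorer,
    reranker: Scorer,
    k_first: int = 20,
    k_final: int = 5,
) -> List[str]:
    """Keep the top k_first docs by the cheap scorer, then reorder them with the reranker."""
    candidates = sorted(docs, key=lambda d: first_stage(query, d), reverse=True)[:k_first]
    reranked = sorted(candidates, key=lambda d: reranker(query, d), reverse=True)
    return reranked[:k_final]


def overlap(query: str, doc: str) -> float:
    # Toy first-stage scorer: number of distinct shared words.
    return float(len(set(query.split()) & set(doc.split())))


def overlap_density(query: str, doc: str) -> float:
    # Toy reranker: shared words normalised by document length.
    words = doc.split()
    return overlap(query, doc) / len(words) if words else 0.0
```

The reranker only ever sees `k_first` documents, which is what keeps the expensive model affordable.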

RobinQu avatar May 20 '24 07:05 RobinQu

Timeline

  • #21
    • [x] experiments with BERT + BGE-M3 model for a better understanding of the forward pass, ~3 days~, 1 day.
    • [x] bge-m3 inference code. A standalone repo may be needed. ~7 days~, 3 days.
  • #20
    • [x] file and filebatch controllers, services, mappers. 3 days.
    • [x] duckdb vectordb operator, retriever factory, 2 days
    • [x] DuckDB: search with filter, delete with filter, 1 day
    • [x] summarization pipeline for files, 2 days
    • [x] file object handler, 2 days
    • [x] file batch background jobs to update its progress, 1 day
    • [x] search tool, 3 days.
    • [ ] (optional) parallel react agent executor, 2 days
    • [x] integration test, 3 days.
    • [ ] eval test on HF QA dataset, 3 days.

RobinQu avatar May 21 '24 01:05 RobinQu

OpenAI's official parameters for RAG: https://platform.openai.com/docs/assistants/tools/file-search/how-it-works

By default, the file_search tool uses the following settings:

  • Chunk size: 800 tokens
  • Chunk overlap: 400 tokens
  • Embedding model: text-embedding-3-large at 256 dimensions
  • Maximum number of chunks added to context: 20 (could be fewer)
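For reference, a sliding-window chunker using the 800/400 defaults quoted above might look like the sketch below. It takes an already tokenized list, and the tokenizer is an assumption left to the caller. This is not OpenAI's implementation, only an illustration of what size and overlap mean.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 800, overlap: int = 400) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens with the previous window."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window reaches the end, so no trailing chunk is a pure duplicate.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With these defaults each token away from the document edges lands in two chunks, so the index holds roughly twice as many chunk tokens as the document has.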

Supported file formats: https://platform.openai.com/docs/assistants/tools/file-search/supported-files

RobinQu avatar May 27 '24 10:05 RobinQu