Advanced RAG implementations
For better evaluation results on the HF QA dataset.
- Reranking: BCE, BGE-M3 scoring
- Query rewriting
- to generate a SQL filter for a given prompt
- to generate hypothetical queries
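The hypothetical-query rewrite above can be sketched as a HyDE-style prompt template. The wording of the template is an illustrative assumption, and the actual LLM call is left out; only the prompt construction is shown:

```python
# HyDE-style query rewrite sketch: instead of embedding the raw query,
# ask an LLM to write a hypothetical answer passage and embed that.
# The template text here is an assumption, not a tested prompt.
def build_hyde_prompt(query: str) -> str:
    """Build a prompt asking an LLM for a hypothetical answer passage.

    The generated passage usually lands closer to relevant documents
    in embedding space than the short raw query does.
    """
    return (
        "Write a short passage that would answer the question below, "
        "as if it were taken from a relevant document.\n"
        f"Question: {query}\n"
        "Passage:"
    )

prompt = build_hyde_prompt("How does ColBERT score documents?")
print(prompt)
```

The returned string would then be sent to the LLM, and the LLM's passage (not the original query) is embedded for retrieval.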
A more complex RAG pipeline may involve agent frameworks (#18).
https://github.com/langchain-ai/langgraph/blob/70c1c996a4c9fe8df518bcd849b3c6453dd0d58b/examples/rag/langgraph_adaptive_rag.ipynb
Reranking: ColBERT, ...
Related work
Methods
- Keyword-based - BM25
- Cross-encoders - BCE reranker
- Late-interaction - ColBERT, BGE-M3
- ColBERT paper: arXiv:2004.12832v2
- https://zhuanlan.zhihu.com/p/683483778
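The late-interaction scoring used by ColBERT and BGE-M3 can be sketched with the MaxSim operator: for each query token, take the maximum similarity against all document tokens, then sum over query tokens. A minimal pure-Python sketch with toy 2-d "embeddings" (real ones come from the model and are L2-normalized):

```python
# Minimal late-interaction (ColBERT-style MaxSim) scoring sketch.
# Token embeddings are plain lists of floats here for illustration.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vecs, doc_vecs):
    """Sum over query tokens of the max similarity to any doc token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy example: doc_a has a close match for both query tokens, doc_b only one.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]
doc_b = [[1.0, 0.0], [1.0, 0.0]]
print(maxsim_score(query, doc_a))  # → 2.0
print(maxsim_score(query, doc_b))  # → 1.0
```

Because document token embeddings can be precomputed offline, inference only needs the query encoding plus this cheap max/sum, which is why late interaction is faster than a cross-encoder.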
Open-source projects
- RAGatouille (library built on the official ColBERT implementation): https://github.com/bclavie/RAGatouille
- BCE family: https://github.com/netease-youdao/BCEmbedding
- BAAI BGE family: https://github.com/FlagOpen/FlagEmbedding
- BCE & BGE in C++: https://github.com/li-plus/chatglm.cpp
Conclusion
- Both the BCE and BGE families can be regarded as SOTA.
- For multilingual use cases, vanilla RAGatouille lags behind.
- Since late-interaction models are faster at inference, bge-m3 is preferred as the first-stage ranking method in the RAG pipeline.
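The two-stage ranking concluded above (fast first stage over all docs, slower reranker over the survivors) can be sketched in outline. Both scorers below are toy stand-ins (word overlap and length-weighted overlap); in the real pipeline they would be bge-m3 and a BCE/BGE cross-encoder:

```python
# Two-stage retrieval sketch: cheap first-stage scoring over the whole
# corpus, then a "heavier" reranker over only the top-k shortlist.
# Both scorers are toy placeholders, not real models.

def first_stage(query, doc):
    # cheap proxy for bge-m3: count of shared words
    return len(set(query.split()) & set(doc.split()))

def rerank(query, doc):
    # proxy for a cross-encoder: overlap weighted by doc brevity
    overlap = len(set(query.split()) & set(doc.split()))
    return overlap / (1 + len(doc.split()))

def two_stage_search(query, docs, k=2):
    shortlist = sorted(docs, key=lambda d: first_stage(query, d), reverse=True)[:k]
    return sorted(shortlist, key=lambda d: rerank(query, d), reverse=True)

docs = [
    "bge m3 late interaction model",
    "bm25 keyword baseline",
    "bge m3 reranker notes and extras",
]
print(two_stage_search("bge m3 model", docs, k=2))
```

The point of the structure: the expensive reranker runs on only `k` documents, so total latency stays close to the first stage's.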
Timeline
- #21
- [x] experiments with BERT + the BGE-M3 model for a better understanding of the forward pass, ~~3 days~~ 1 day.
- [x] bge-m3 inference code; a standalone repo may be needed. ~~7 days~~ 3 days.
- #20
- [x] file and filebatch controllers, services, mappers. 3 days.
- [x] DuckDB vector-DB operator, retriever factory, 2 days
- [x] DuckDB: search with filter, delete with filter, 1 day
- [x] summarization pipeline for files, 2 days
- [x] file object handler, 2 days
- [x] file batch background jobs to update their progress, 1 day
- [x] search tool 3 days.
- [ ] (optional) parallel react agent executor, 2 days
- [x] integration test 3 days.
- [ ] eval test on the HF QA dataset. 3 days.
OpenAI's official parameters for RAG: https://platform.openai.com/docs/assistants/tools/file-search/how-it-works
By default, the file_search tool uses the following settings:
- Chunk size: 800 tokens
- Chunk overlap: 400 tokens
- Embedding model: text-embedding-3-large at 256 dimensions
- Maximum number of chunks added to context: 20 (could be fewer)
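The default chunking above (800-token windows with 400-token overlap) is a simple sliding window. A sketch using whitespace "tokens" for illustration (the real tool counts model tokens, not words):

```python
# Sliding-window chunker matching the OpenAI file_search defaults:
# 800-token chunks with 400-token overlap, i.e. a step of 400.

def chunk_tokens(tokens, size=800, overlap=400):
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # last window already covered the tail
    return chunks

tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
print(len(chunks))  # → 2 (windows [0:800] and [400:1000])
```

With 400 tokens of overlap, every token (except at the edges) appears in two chunks, which trades index size for recall at chunk boundaries.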
Supported file formats: https://platform.openai.com/docs/assistants/tools/file-search/supported-files