agentscope
Reformat and improve RAG module and agents
Description
Updates
Changes on code structure
- migrate and reformat RAG/knowledge module(s) and RAG agent(s) from examples to a module in src
- add `llama-index` as `rag_requires` in `setup.py`
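Since `llama-index` becomes an optional extra, code that depends on it would typically defer the import so that users who skip the RAG extras are unaffected. The block below is a minimal sketch of that pattern, not the code in this PR: `require_optional`, `LazyKnowledge`, and the extra name `"rag"` are hypothetical names chosen for illustration.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency, or raise a helpful ImportError.

    Hypothetical helper for illustration; not part of AgentScope.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install the optional '{extra}' dependencies first."
        ) from exc


class LazyKnowledge:
    """Toy knowledge class that needs its backend only when it is used."""

    def __init__(self, backend_module: str = "llama_index"):
        self.backend_module = backend_module
        self._backend = None

    def backend(self):
        # The import happens on first use, not when this file is imported,
        # so importing the package never fails for non-RAG users.
        if self._backend is None:
            self._backend = require_optional(self.backend_module, extra="rag")
        return self._backend
```

With this shape, constructing the object never touches the optional package; only calling `backend()` does, and a missing package yields an error that names the extra to install.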
Changes on the RAG agent module
- be compatible with the new `KnowledgeBank` feature
- the configurations for the RAG-related functionalities are relocated back to knowledge modules
- the retrieve method merges the retrievers from the `KnowledgeBank` members
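The description does not spell out how results from several retrievers are combined, so the following is only one plausible strategy, sketched under assumptions: each retriever is modeled as a plain function returning `(text, score)` pairs, and `retrieve_from_all` is a hypothetical name, not the agent's actual method.

```python
from typing import Callable, List, Tuple

# Assumption: a retriever is just a function, query -> list of (text, score).
Retriever = Callable[[str], List[Tuple[str, float]]]


def retrieve_from_all(
    query: str,
    retrievers: List[Retriever],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Query every retriever and return the top_k unique chunks by score."""
    best = {}
    for retriever in retrievers:
        for text, score in retriever(query):
            # Keep the highest score if the same chunk comes back twice.
            if text not in best or score > best[text]:
                best[text] = score
    # Sort merged results from most to least relevant.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

Deduplicating by chunk text matters when two knowledge objects index overlapping documents; comparing raw scores across retrievers assumes they use comparable similarity measures.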
Changes on the RAG/knowledge module
- Rename the RAG modules to Knowledge (e.g., LlamaIndexRAG -> LlamaIndexKnowledge)
- store and persist processed embeddings/indices/documents
- support loading multiple doc types and dirs for one index
- support docs management in the obtained (persisted) index
- add a refresh function to update the index when needed
- enable agents to reset or add new retrievers
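To illustrate the persist-and-refresh idea from the list above, here is a toy sketch that stores embeddings in a JSON file and only embeds documents it has not seen. It is an assumption-laden stand-in: `ToyPersistedIndex`, its `refresh` signature, and the JSON layout are hypothetical and do not reflect the LlamaIndex-based implementation in this PR.

```python
import json
import os
from typing import Callable, Dict, List


class ToyPersistedIndex:
    """Hypothetical persisted index; not AgentScope or LlamaIndex API."""

    def __init__(self, persist_path: str, embed: Callable[[str], List[float]]):
        self.persist_path = persist_path
        self.embed = embed
        self.vectors: Dict[str, List[float]] = {}
        if os.path.exists(persist_path):
            # Reuse embeddings computed in an earlier run.
            with open(persist_path, "r", encoding="utf-8") as f:
                self.vectors = json.load(f)

    def refresh(self, docs: Dict[str, str]) -> List[str]:
        """Embed new documents, drop removed ones, then persist to disk.

        Returns the ids of the documents that were newly embedded.
        """
        added = [doc_id for doc_id in docs if doc_id not in self.vectors]
        for doc_id in added:
            self.vectors[doc_id] = self.embed(docs[doc_id])
        # Remove entries whose documents no longer exist.
        for doc_id in list(self.vectors):
            if doc_id not in docs:
                del self.vectors[doc_id]
        with open(self.persist_path, "w", encoding="utf-8") as f:
            json.dump(self.vectors, f)
        return added
```

This sketch keys only on document ids, so an edited document with an unchanged id is not re-embedded; a real refresh would also need some change detection, such as a content hash.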
Improving utility of knowledge module
- reformat the knowledge module config to be easier to use: the new format only configures the `KnowledgeBank`
- introduce `KnowledgeBank`:
  - `KnowledgeBank` provides an easier way to initialize a knowledge object: just call `add_data_as_knowledge` with `knowledge_id` (a string as the identifier for this knowledge object), `emb_model_name` (the name of the embedding model config) and `data_dirs_and_types` (a dictionary of data directories and the wanted file extensions), as shown in `rag_example.py`:

        knowledge_bank.add_data_as_knowledge(
            knowledge_id="agentscope_tutorial_rag",
            emb_model_name="qwen_emb_config",
            data_dirs_and_types={
                "../../docs/sphinx_doc/en/source/tutorial": [".md"],
            },
        )

  - Knowledge objects in `KnowledgeBank` can be shared and duplicated by multiple agents, which avoids embedding the same documents more than once.
  - RAG agents can load multiple Knowledge objects (based on the `"knowledge_id"` in `knowledge_config.json`) with associated retrievers to perform multi-source information retrieval. They just need to be passed into the `KnowledgeBank.equip` function.
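The sharing behavior can be pictured as a registry keyed by `knowledge_id`. The sketch below is a toy model under stated assumptions, not the `KnowledgeBank` implementation: the `Toy*` classes, the `knowledge_ids` argument of `equip`, and the `knowledge_list` attribute are hypothetical, and the real `equip` signature may differ.

```python
from typing import Dict, List


class ToyKnowledge:
    """Stand-in for a knowledge object (index plus retriever)."""

    def __init__(self, knowledge_id: str, emb_model_name: str,
                 data_dirs_and_types: Dict[str, List[str]]):
        self.knowledge_id = knowledge_id
        self.emb_model_name = emb_model_name
        self.data_dirs_and_types = data_dirs_and_types


class ToyAgent:
    """Stand-in for a RAG agent that holds references to knowledge objects."""

    def __init__(self, name: str):
        self.name = name
        self.knowledge_list: List[ToyKnowledge] = []


class ToyKnowledgeBank:
    """Registry that builds each knowledge object once and hands it out."""

    def __init__(self):
        self.stored: Dict[str, ToyKnowledge] = {}

    def add_data_as_knowledge(self, knowledge_id, emb_model_name,
                              data_dirs_and_types):
        if knowledge_id in self.stored:
            raise ValueError(f"knowledge_id '{knowledge_id}' already exists")
        self.stored[knowledge_id] = ToyKnowledge(
            knowledge_id, emb_model_name, data_dirs_and_types
        )

    def equip(self, agent: ToyAgent, knowledge_ids: List[str]) -> None:
        # Agents receive references to the stored objects, so documents
        # are processed once no matter how many agents use them.
        for knowledge_id in knowledge_ids:
            agent.knowledge_list.append(self.stored[knowledge_id])


bank = ToyKnowledgeBank()
bank.add_data_as_knowledge(
    knowledge_id="agentscope_tutorial_rag",
    emb_model_name="qwen_emb_config",
    data_dirs_and_types={"../../docs/sphinx_doc/en/source/tutorial": [".md"]},
)
tutor, helper = ToyAgent("tutor"), ToyAgent("helper")
bank.equip(tutor, ["agentscope_tutorial_rag"])
bank.equip(helper, ["agentscope_tutorial_rag"])
```

In this toy both agents end up holding the same object, which is the property that avoids repeated embedding.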
Tutorial
Both English and Chinese tutorials are added as `209-rag.md`.
Checklist
Please check the following items before code is ready to be reviewed.
- [x] Code has passed all tests
- [x] Docstrings have been added/updated in Google Style
- [x] Documentation has been updated
- [x] Code is ready for review
@ZiTao-Li Is this PR ready for review?
The `ImportError` from the LlamaIndex library is still exposed to users who don't use the RAG module.