rag-experiment-accelerator
Implement semantic chunking within RAG EXP ACC
RAGE currently provides several chunking methods. A semantic chunking method should be added alongside them, and it should be configurable.
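To make the request concrete, here is a minimal sketch of one common form of semantic chunking: split the text into sentences, then start a new chunk whenever the similarity between neighbouring sentences drops below a configurable threshold. Everything in it is an assumption for illustration, not existing RAGE code: the names `semantic_chunks`, `similarity_threshold`, and `embed_fn` are hypothetical, and the bag-of-words `embed` function is only a toy stand-in for a real sentence-embedding model.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def embed(sentence: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(
    text: str,
    similarity_threshold: float = 0.2,  # hypothetical config knob
    embed_fn: Callable[[str], Counter] = embed,  # swap in a real model here
) -> List[str]:
    """Group consecutive sentences; break where adjacent similarity is low."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed_fn(prev), embed_fn(cur)) < similarity_threshold:
            chunks.append([cur])  # topic shift: start a new chunk
        else:
            chunks[-1].append(cur)  # same topic: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]
```

Exposing the threshold and the embedding function as parameters is what would make such a chunker configurable; a real implementation would likely also cap the chunk size.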
What is the currently recommended way to create a custom chunker? If I understand correctly, one would have to do the following:
- In documentLoader.py, add a new custom _FORMAT_PROCESSORS entry
- Create a customLoader that follows the output format of structuredLoader, like this: docsList.append({str(uuid.uuid4()): {"content": doc.page_content, "metadata": doc.metadata}})
Is my understanding correct? @quovadim, @ritesh-modi