rag-experiment-accelerator

Implement semantic chunking within RAG EXP ACC

Open ritesh-modi opened this issue 1 year ago • 1 comment

Currently, RAGE provides several chunking methods. A configurable semantic chunking method should be added alongside them.
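
To make the request concrete, here is a minimal sketch of what a configurable semantic chunker might look like. It is not RAGE code: the function name `semantic_chunks`, the `threshold` parameter, and the pluggable `embed`/`similarity` callables are all hypothetical. A bag-of-words cosine similarity stands in for a real embedding model so the example runs with the standard library only. The idea is to split the text into sentences and start a new chunk whenever the similarity between adjacent sentences drops below the threshold.

```python
import math
import re
from collections import Counter


def _bow(sentence: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def semantic_chunks(text, threshold=0.2, embed=_bow, similarity=_cosine):
    """Group adjacent sentences; open a new chunk when similarity falls below threshold.

    All names and defaults here are assumptions made for illustration.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    previous = embed(sentences[0])
    for sentence in sentences[1:]:
        current = embed(sentence)
        if similarity(previous, current) < threshold:
            chunks.append([sentence])    # topic shift: start a new chunk
        else:
            chunks[-1].append(sentence)  # same topic: extend the current chunk
        previous = current
    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    sample = "Cats are small pets. Cats are playful pets. Stock markets fell sharply today."
    for chunk in semantic_chunks(sample):
        print(chunk)
```

In practice `embed` would presumably call an embedding model and `threshold` would come from the experiment configuration, which is what would make the method configurable.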

ritesh-modi, Nov 04 '24 12:11

What's the currently recommended way to create a custom chunker? If I understand correctly, one would have to do the following:

  1. In documentLoader.py, add a new custom _FORMAT_PROCESSORS entry
  2. Create a customLoader that follows the output format of structuredLoader, like this: docsList.append({str(uuid.uuid4()): {"content": doc.page_content, "metadata": doc.metadata}})
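
The output format in step 2 could be sketched roughly as below. This is only an illustration of the list-of-dicts shape quoted above, not the repository's actual loader: `Doc` is a hypothetical stand-in for a document object exposing `page_content` and `metadata`, and `to_docs_list` is a made-up helper name. How such a function would be registered in _FORMAT_PROCESSORS is not shown, since that interface is not described here.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a document with page_content and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def to_docs_list(docs):
    """Convert documents into the shape quoted in step 2.

    Each element is a single-key dict: a fresh UUID string mapped to
    the document's content and metadata.
    """
    docs_list = []
    for doc in docs:
        docs_list.append(
            {str(uuid.uuid4()): {"content": doc.page_content, "metadata": doc.metadata}}
        )
    return docs_list


if __name__ == "__main__":
    chunks = [Doc("first chunk", {"source": "a.txt"}), Doc("second chunk")]
    for entry in to_docs_list(chunks):
        print(entry)
```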

Is my understanding correct? @quovadim, @ritesh-modi

FlorianPydde, Nov 13 '24 17:11