[DOC]: Add "blueprint" diagram and explain

Open randerzander opened this issue 1 year ago • 1 comments

How would you describe the priority of this documentation request

Significant improvement

Please provide a link or source to the relevant docs

README.md

Describe the problems in the documentation

The README text does not make it clear how parts of the architecture diagram fit together and how NVIDIA NIMs are used. We recommend that the diagram also be explained for clarity.

Diagram:

(Optional) Propose a correction or improvement

No response

Sep 03 '24 16:09 randerzander

@randerzander Further, we should add descriptions such as the following to make it easier to understand how the architecture comes together:

PDF Ingestion NIM microservices

nv-yolox-structured-image: A fine-tuned object detection model to detect charts, plots, and tables in PDFs.
Deplot: A popular community pix2struct model for generating descriptions of charts.
CACHED: An object detection model used to identify various elements in graphs.
PaddleOCR: An optical character recognition (OCR) model to transcribe text from tables and charts.

NVIDIA NeMo Retriever NIM microservices

nv-embedqa-e5-v5: A popular community base-embedding model optimized for text question-answering retrieval.
nv-rerankqa-mistral4b-v3: A popular community base model fine-tuned for text reranking for high-accuracy question answering.

For more information, see An Easy Introduction to Multimodal Retrieval-Augmented Generation.
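
To make the grouping above easier to scan, here is a minimal sketch that records each microservice with its group and role and prints one line per group. It only restates the descriptions in this comment: the `Microservice` class and the `by_group` and `describe` helpers are hypothetical names made up for illustration, not part of any NVIDIA API, and nothing here calls a real NIM endpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Microservice:
    """Illustrative record only; not an NVIDIA type."""
    name: str
    group: str
    role: str


# Names and roles are copied from the descriptions in this comment.
MICROSERVICES = [
    Microservice("nv-yolox-structured-image", "PDF ingestion",
                 "detect charts, plots, and tables in PDFs"),
    Microservice("Deplot", "PDF ingestion",
                 "generate descriptions of charts"),
    Microservice("CACHED", "PDF ingestion",
                 "identify various elements in graphs"),
    Microservice("PaddleOCR", "PDF ingestion",
                 "transcribe text from tables and charts"),
    Microservice("nv-embedqa-e5-v5", "NeMo Retriever",
                 "embed text for question-answering retrieval"),
    Microservice("nv-rerankqa-mistral4b-v3", "NeMo Retriever",
                 "rerank text for high-accuracy question answering"),
]


def by_group(services):
    """Map each group name to its microservice names, keeping list order."""
    groups = {}
    for service in services:
        groups.setdefault(service.group, []).append(service.name)
    return groups


def describe(services):
    """Return one 'group: name, name, ...' line per group."""
    return "\n".join(
        f"{group}: {', '.join(names)}"
        for group, names in by_group(services).items()
    )


if __name__ == "__main__":
    print(describe(MICROSERVICES))
```

A table like this could sit next to the diagram in the README so that each box in the diagram maps to a named microservice and a one-line role.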

Sep 20 '24 17:09 abeltre1