nv-ingest
[DOC]: Add "blueprint" diagram and explain
How would you describe the priority of this documentation request
Significant improvement
Please provide a link or source to the relevant docs
README.md
Describe the problems in the documentation
The README text does not make it clear how parts of the architecture diagram fit together and how NVIDIA NIMs are used. We recommend that the diagram also be explained for clarity.
Diagram:
(Optional) Propose a correction or improvement
No response
@randerzander Further, we should add descriptions like the following to make it easier to understand how the pieces of the architecture fit together:
PDF Ingestion NIM microservices
- nv-yolox-structured-image: A fine-tuned object detection model to detect charts, plots, and tables in PDFs.
- Deplot: A popular community pix2struct model for generating descriptions of charts.
- CACHED: An object detection model used to identify various elements in graphs.
- PaddleOCR: An optical character recognition (OCR) model to transcribe text from tables and charts.
NVIDIA NeMo Retriever NIM microservices
- nv-embedqa-e5-v5: A popular community base-embedding model optimized for text question-answering retrieval.
- nv-rerankqa-mistral4b-v3: A popular community base model fine-tuned for text reranking for high-accuracy question answering.
For more information, see An Easy Introduction to Multimodal Retrieval-Augmented Generation.
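As a possible aid for the README, the descriptions above could also be summarized as a small lookup table. The sketch below is illustrative only: the roles are copied from the list above, while the dictionary layout, the element labels (`chart`, `plot`, `table`, `graph`), and the helper `models_for` are hypothetical and are not part of the nv-ingest API.

```python
# Illustrative summary of the model roles described above.
# Assumption: the structure, element labels, and function name are
# hypothetical; they are not nv-ingest identifiers.

# Extraction-side models and the element types each one is described as handling.
EXTRACTION_MODELS = {
    "nv-yolox-structured-image": {"task": "object detection", "handles": {"chart", "plot", "table"}},
    "Deplot": {"task": "chart description", "handles": {"chart"}},
    "CACHED": {"task": "graph element detection", "handles": {"graph"}},
    "PaddleOCR": {"task": "OCR", "handles": {"table", "chart"}},
}

# Retrieval-side models and their roles.
RETRIEVAL_MODELS = {
    "nv-embedqa-e5-v5": "embedding",
    "nv-rerankqa-mistral4b-v3": "reranking",
}


def models_for(element_type: str) -> list[str]:
    """Return the extraction models whose description mentions this element type."""
    return sorted(
        name
        for name, info in EXTRACTION_MODELS.items()
        if element_type in info["handles"]
    )


if __name__ == "__main__":
    for element in ("chart", "table", "graph"):
        print(element, "->", models_for(element))
```

A table like this only restates which model is described as handling which element; it does not say in what order the pipeline calls them, which is what the requested diagram explanation would still need to cover.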