Satheesh K

Results 10 comments of Satheesh K

Hi @vinayak-mehta , I have also thought of implementing this. [dramatiq](https://github.com/Bogdanp/dramatiq) or [celery](https://github.com/celery/celery) are my suggestions for asynchronous processing of pages.
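
For readers without a message broker set up, the page-level fan-out those libraries provide can be sketched with the standard library alone; `process_page` here is a hypothetical placeholder for the real per-page work, not code from the project.

```python
from concurrent.futures import ProcessPoolExecutor

def process_page(page_number):
    # Hypothetical stand-in for the real per-page work
    # (e.g. parsing tables out of one PDF page).
    return page_number * page_number

def process_pages(page_numbers):
    # Fan pages out across worker processes and collect results in the
    # original order, mirroring what a dramatiq/celery worker pool would
    # do with a task queue in between.
    with ProcessPoolExecutor(max_workers=4) as pool:
        return list(pool.map(process_page, page_numbers))

if __name__ == "__main__":
    print(process_pages([1, 2, 3, 4]))  # → [1, 4, 9, 16]
```

With dramatiq or celery the same shape holds, except `process_page` becomes a broker-backed task and the results are collected asynchronously.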

What prompt are you using at inference time? Most probably it is messing up the JSON output. If you have fine-tuned the base model for your task, try using ``...
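
When the prompt is wrong, the decoder often wraps or truncates the JSON it emits. A small helper like the hypothetical `extract_json` below (my own sketch, not part of Donut) makes that failure mode visible by pulling the first balanced JSON object out of raw model output:

```python
import json

def extract_json(text):
    # Scan for the first '{' and track brace depth until it closes,
    # then attempt to parse that span; return None if nothing parses.
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                try:
                    return json.loads(text[start:i + 1])
                except json.JSONDecodeError:
                    return None
    return None

# Model output often carries task tokens around the JSON payload:
print(extract_json('<s_cord-v2>{"menu": {"nm": "coffee"}}</s>'))
# → {'menu': {'nm': 'coffee'}}
```

If this returns None on your outputs, the model is not producing well-formed JSON at all, which usually points back at the inference prompt.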

Did you fine-tune the model on your custom dataset, or are you trying to use the off-the-shelf `cord-v2` model?

I see some changes compared to the code in this repo, but I still suggest checking your prompt at inference time if the edit distance was very low during training...

@vinayak-mehta , have you tried pdftoppm (from poppler-utils) for converting PDF to PNG?
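
A minimal sketch of driving pdftoppm from Python, assuming poppler-utils is installed; only the argv list is assembled here, and the commented `subprocess.run` line is what would actually execute it:

```python
import subprocess  # needed once the commented run line is enabled

def pdftoppm_cmd(pdf_path, out_prefix, dpi=150):
    # Build the argv list for pdftoppm: -png selects PNG output,
    # -r sets the rendering resolution in DPI.
    return ["pdftoppm", "-png", "-r", str(dpi), pdf_path, out_prefix]

cmd = pdftoppm_cmd("input.pdf", "page")
# subprocess.run(cmd, check=True)  # writes page-1.png, page-2.png, ...
print(cmd)
```

Keeping the command construction separate from execution also makes it easy to hand off to a worker queue later.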

OK, in this [post](https://serverfault.com/questions/167573/fast-pdf-to-jpg-conversion-on-linux-wantedl), there was one more suggestion to do this with [MuPDF](https://mupdf.com/index.html).

Hi @WaterKnight1998 @mht-sharma , do you have an inference script for the Donut document-parsing model that uses the encoder and decoder ONNX models? Something similar to this [TrOCR gist](https://gist.github.com/mht-sharma/f38c670930ac7df413c07327e692ee39).

I have used the `convert_llama_hf_to_nemo.py` script to convert the Llama 2 70B model from Hugging Face format to NeMo format. Here is the exact command:

```
python3 -u /opt/NeMo/scripts/checkpoint_converters/convert_llama_hf_to_nemo.py \
    --input_name_or_path=/workspace/llama2_models \
    --output_path=/workspace/llama2_models/llama2-70b-base.nemo
```

I think it is straightforward to get text output using the donut-base model. Load `naver-clova-ix/donut-base` from Hugging Face and use `` as the prompt.

```
from donut import DonutModel
import torch
from PIL...
```