Chidhambararajan
You might have to fine-tune a model for your specific use case.
There is no official code snippet in donut to facilitate the same. Try incorporating the suggestions in the link here: https://spell.ml/blog/gradient-checkpointing-pytorch-YGypLBAAACEAefHs Let us know if it works.
960x640 works. I guess that the input dimensions should be a multiple of 320. Going below this resolution might give junk, as the PDF might become pixelated and fonts not...
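If you want to try the "multiple of 320" guess quickly, here is a minimal sketch that snaps a target size to the nearest multiple of 320 before you put it in your config. The helper names `snap_to_multiple` and `snap_input_size` are hypothetical (they are not part of donut), and the 320 rule itself is only my guess from what worked, not a documented constraint.

```python
from typing import Tuple


def snap_to_multiple(value: int, base: int = 320) -> int:
    """Round value to the nearest positive multiple of base (halves round up).

    base=320 reflects the guess in this thread, not a documented donut rule.
    """
    if value <= 0 or base <= 0:
        raise ValueError("value and base must be positive")
    # Integer arithmetic avoids float rounding surprises.
    snapped = (value + base // 2) // base * base
    # Never go below one full multiple.
    return max(base, snapped)


def snap_input_size(size: Tuple[int, int], base: int = 320) -> Tuple[int, int]:
    """Snap both dimensions of a size tuple to multiples of base."""
    first, second = size
    return snap_to_multiple(first, base), snap_to_multiple(second, base)


if __name__ == "__main__":
    print(snap_input_size((1000, 700)))
```

Sizes that are already multiples of 320, such as 960x640, come back unchanged.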
This issue briefly discusses the same: https://github.com/clovaai/donut/issues/37 However, it gives a confidence score for the whole JSON, not for individual entities. This model predicts the whole JSON as...
Thanks for updating your comment, it looks fine now 🙌 As of now, there is no direct method to extract confidence scores for specific fields. But each and every predicted...
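One possible workaround, as a rough sketch only: if you can get a probability for each generated token and you know which token positions make up a field's value, you can aggregate just that span. Everything here is an assumption for illustration. `span_confidence` is a hypothetical helper, not a donut API; the probabilities are made-up inputs; and the geometric mean is just one reasonable way to aggregate.

```python
import math
from typing import Sequence


def span_confidence(token_probs: Sequence[float], start: int, end: int) -> float:
    """Geometric mean of the token probabilities in positions [start, end).

    token_probs is assumed to hold one probability per generated token;
    how you obtain them from the decoder is outside this sketch.
    """
    span = token_probs[start:end]
    if not span:
        raise ValueError("span is empty")
    if any(p <= 0.0 for p in span):
        # A zero-probability token makes the whole span score zero.
        return 0.0
    # Average in log space for numerical stability, then map back.
    mean_log_prob = sum(math.log(p) for p in span) / len(span)
    return math.exp(mean_log_prob)


if __name__ == "__main__":
    # Made-up probabilities: the first two tokens belong to one field,
    # the last two to another.
    probs = [0.9, 0.9, 0.5, 0.5]
    print(span_confidence(probs, 0, 2))
    print(span_confidence(probs, 2, 4))
    print(span_confidence(probs, 0, len(probs)))
```

Scoring the full range gives a whole-output score, so the same helper covers both the whole-JSON and the per-field case.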
Due to an NDA, I cannot share my exact document classes, but donut's own training code does this: https://github.com/clovaai/donut/blob/master/train.py#L66-L70 If you look at this line, this line is...
https://github.com/clovaai/donut/blob/e6623ad56c0e9f12a426dab2d8b2d65a39d64689/donut/model.py#L159-L161 Can I change the pretrained tokenizer from "hyunwoongko/asian-bart-ecjk" to "hyunwoongko/asian-bart-en"? The latter is an English-only decoder from the same repo; would that do the trick? Cause I...
> > https://github.com/clovaai/donut/blob/e6623ad56c0e9f12a426dab2d8b2d65a39d64689/donut/model.py#L159-L161 > > Can I change the pretrained tokenizer from "hyunwoongko/asian-bart-ecjk" to "hyunwoongko/asian-bart-en". The later one is an english only decoder from the same repo, would...
Tried the above-mentioned change but still observed characters from other languages in predictions during intermediate epochs `Prediction: Examination Examination Generation: Generation: General Examination: GENERAL EXamination: GENERAL APPEANANACE normal, pleasant, pleasant,...
Will try it out, thanks for the tip!