wandering-walrus

Results: 3 comments by wandering-walrus

So the pipeline for inference with segment-level positional features would be to run it through LayoutLMv3 fine-tuned on PubLayNet, then modify the OCR results with the text segment positions received...
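For illustration, here is a minimal sketch of the "modify the OCR results" step. It assumes the layout detector's output is already available as plain `(x0, y0, x1, y1)` segment boxes and that the OCR output is a list of `(text, box)` pairs. The function names and the center-point matching rule are my own assumptions, not something taken from the LayoutLMv3 code, and no model or OCR library is called here.

```python
def center(box):
    """Return the center point of an (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def contains(segment, point):
    """True if the point lies inside the segment box (edges inclusive)."""
    x0, y0, x1, y1 = segment
    px, py = point
    return x0 <= px <= x1 and y0 <= py <= y1


def assign_segment_boxes(words, segments):
    """Replace each word box with the box of the first segment containing its center.

    words: list of (text, box) pairs from OCR (assumed format).
    segments: list of boxes from a layout detector (assumed format).
    Words that fall in no segment keep their original word-level box.
    """
    result = []
    for text, box in words:
        c = center(box)
        match = next((s for s in segments if contains(s, c)), None)
        result.append((text, match if match is not None else box))
    return result


# Hypothetical example data: two words inside one detected segment, one outside.
words = [
    ("Hello", (10, 10, 50, 20)),
    ("world", (55, 10, 90, 20)),
    ("Footer", (10, 200, 60, 210)),
]
segments = [(5, 5, 100, 25)]
adjusted = assign_segment_boxes(words, segments)
```

Matching on the word's center is only one possible rule. Overlap area would be another, and it may behave better when detected segments overlap.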

Looks like I submitted this too soon. It seems the word-level positions were programmatically adjusted to segment-level positions based on the labels. Is the recommendation to do this...
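To make the adjustment I mean concrete, here is a rough sketch of how word-level boxes could be turned into segment-level boxes from labels. It assumes that each run of consecutive words sharing a label forms one segment, which is a simplification on my part and not necessarily how the original preprocessing does it. The function name is hypothetical.

```python
from itertools import groupby


def words_to_segment_boxes(boxes, labels):
    """Give every word the union box of its run of consecutive same-label words.

    boxes: list of (x0, y0, x1, y1) word boxes in reading order.
    labels: list of labels, one per word (assumed to mark segment membership).
    """
    if len(boxes) != len(labels):
        raise ValueError("boxes and labels must have the same length")

    out = []
    idx = 0
    for _, run in groupby(labels):
        n = len(list(run))
        chunk = boxes[idx:idx + n]
        # Union of all word boxes in this run.
        union = (
            min(b[0] for b in chunk),
            min(b[1] for b in chunk),
            max(b[2] for b in chunk),
            max(b[3] for b in chunk),
        )
        out.extend([union] * n)
        idx += n
    return out


# Hypothetical example: two "header" words followed by one "answer" word.
boxes = [(10, 10, 50, 20), (55, 12, 90, 22), (10, 40, 40, 50)]
labels = ["header", "header", "answer"]
segment_boxes = words_to_segment_boxes(boxes, labels)
```

Note that this only works when labels are available, so it would apply to training data and not to inference on unlabeled documents.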

Thanks, @NielsRogge. I haven't been able to find much on this topic. Do you think LayoutLMv3 fine-tuned on the PubLayNet dataset could be used for this?