MIT PATEL

8 comments by MIT PATEL

Hey @nik13, check this out: https://github.com/huggingface/transformers/issues/19190. I modified the inference part a little and it worked for me.

I have created notebooks for LayoutLM training and inference. They can handle the whole image, since the image is divided into 512 tokens. [Notebook](https://github.com/mit1280/Document-AI/blob/main/FineTuning_LayoutLMv3_Trainer_HF_DocLayNet.ipynb)

Hi @nikhilKumarMarepally, please check https://github.com/mit1280/Document-AI/blob/main/LayoutLMv3_Inference.ipynb. You need to stack "input_ids", "attention_mask", and "bbox". All of them are lists, so first convert each one to a tensor and then stack them. This will resolve the issue.
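The convert-then-stack step could look like the sketch below. The `encodings` structure and its field values are assumptions for illustration; only the three field names come from the comment above.

```python
import torch

# Hypothetical per-page encodings: each field is still a plain Python list,
# as produced before batching. Values here are made up for illustration.
encodings = [
    {"input_ids": [0, 100, 2], "attention_mask": [1, 1, 1], "bbox": [[0, 0, 0, 0]] * 3},
    {"input_ids": [0, 200, 2], "attention_mask": [1, 1, 1], "bbox": [[1, 2, 3, 4]] * 3},
]

# Convert each list to a tensor first, then stack into one batch tensor per key.
batch = {
    key: torch.stack([torch.tensor(e[key]) for e in encodings])
    for key in ("input_ids", "attention_mask", "bbox")
}

print(batch["input_ids"].shape)  # → torch.Size([2, 3])
print(batch["bbox"].shape)       # → torch.Size([2, 3, 4])
```

`torch.stack` requires every element to already be a tensor of the same shape, which is why the lists must be converted first.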

Hi @nikhilKumarMarepally, for LayoutLMv3 training you need the page text, bounding-box coordinates, labels, and the image. If you have data like https://guillaumejaume.github.io/FUNSD/ then you can use LayoutLM, else please...
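For illustration, one training record with those four ingredients might look like the sketch below. The field names, label scheme, and file name are assumptions loosely modeled on the FUNSD format, not a fixed API.

```python
# Hypothetical FUNSD-style record: words, one box per word (LayoutLM expects
# coordinates normalized to a 0-1000 scale), one label per word, and the page image.
example = {
    "words": ["Invoice", "Number:", "12345"],
    "bboxes": [[50, 40, 180, 60], [190, 40, 310, 60], [320, 40, 420, 60]],
    "labels": ["B-HEADER", "I-HEADER", "B-ANSWER"],
    "image_path": "page_0.png",  # made-up file name
}

# Sanity check: every word needs exactly one box and one label,
# and every box needs four coordinates.
assert len(example["words"]) == len(example["bboxes"]) == len(example["labels"])
assert all(len(box) == 4 for box in example["bboxes"])
```

If your annotations don't align word-for-word like this, they have to be converted into this shape before fine-tuning.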

Hi @basteran, can you share your fine-tuning script if possible? I am getting the same error as you: > ValueError: The model did not return a loss from the inputs,...

Thanks @basteran for sharing your work. I think it would be great if we moved this to a Hugging Face discussion; I think we would get more input there. The code looks almost similar...

@basteran there is none right now. I will create one and tag you in it.

@basteran here you go https://discuss.huggingface.co/t/kosmos-fine-tuning/75691