Finetuning Dataset Annotation Format

Open bely66 opened this issue 2 years ago • 2 comments

Hi everyone, I was fine-tuning on a TSR (table structure recognition) dataset of 487 tables. The tables are different from those in the PubTabMed dataset.

At first I annotated the dataset in the usual way, with bounding boxes that cover the whole table and the full extent of each column and row. This differs from the original PubTabMed annotation, where the box borders touch the text.
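To make the difference between the two annotation styles concrete, here is a minimal sketch of turning a "loose" box into a "tight" one by shrinking it to the text it contains. It assumes word-level boxes are available (for example from OCR or a PDF parser) in `[xmin, ymin, xmax, ymax]` pixel coordinates. The helper names `center_inside` and `tighten_to_text` are hypothetical and not part of the table-transformer code, and the exact tightening rules of the original dataset may differ from this.

```python
from typing import List, Optional, Sequence

Box = List[float]  # [xmin, ymin, xmax, ymax], assumed pixel coordinates


def center_inside(word: Sequence[float], region: Sequence[float]) -> bool:
    """Return True if the center of the word box lies inside the region box."""
    cx = (word[0] + word[2]) / 2
    cy = (word[1] + word[3]) / 2
    return region[0] <= cx <= region[2] and region[1] <= cy <= region[3]


def tighten_to_text(region: Sequence[float],
                    words: Sequence[Sequence[float]]) -> Optional[Box]:
    """Shrink a loose region box to the union of the word boxes it contains.

    Returns None when no word center falls inside the region, so the caller
    can decide how to handle empty rows or columns.
    """
    inside = [w for w in words if center_inside(w, region)]
    if not inside:
        return None
    return [
        min(w[0] for w in inside),
        min(w[1] for w in inside),
        max(w[2] for w in inside),
        max(w[3] for w in inside),
    ]


# Hypothetical example: a loose row box and three word boxes,
# the last of which belongs to the next row.
loose_row = [10, 100, 500, 140]
words = [[20, 108, 80, 130], [200, 110, 260, 132], [30, 150, 90, 170]]

tight_row = tighten_to_text(loose_row, words)
print(tight_row)  # [20, 108, 260, 132]
```

Whichever convention is chosen, the training and evaluation annotations should follow the same one; otherwise the AP and AR numbers from the two runs are not directly comparable.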

In this case the model score was: AP50: 0.794, AP75: 0.458, AP: 0.472, AR: 0.627

I thought this score was very low, so I changed the annotations to match the PubTabMed dataset and ended up with: AP50: 0.705, AP75: 0.348, AP: 0.371, AR: 0.531

That is worse on every metric.

Why is that happening, how can I fix it, and what do I need to look for to make sure that things are running well?

bely66, Jul 06 '23 11:07

@bely66 Hi, could you please tell me how you prepared your data for fine-tuning? I also want to fine-tune the structure models but don't know how to prepare my own dataset. Looking forward to your reply.

YingxuanW, Sep 06 '24 02:09

+1. Could you share your contact information?

dreamlychina, Nov 12 '24 09:11