Feat: weighted average table metrics

Open badGarnet opened this issue 1 year ago • 3 comments

This PR replaces the unweighted average used for table metrics with an average weighted by the number of actual (ground truth) tables.

  • for pages that have ground truth tables, the weight is proportional to the number of ground truth tables on that page
  • pages that have no ground truth tables but do have predicted tables (false positives) are assigned a weight of 1 table for the whole page when calculating the mean value of table_level_acc
  • pages with only false positive tables do not contribute to the table structure or table content metrics
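
As a rough illustration of the weighting rules above (a minimal sketch, not the code in this PR): the per-page dictionary layout, the `predicted_tables` and `content_score` keys, the helper names, and the assumption that a false positive page scores 0.0 for table_level_acc are all hypothetical.

```python
import math


def page_weight(gt_tables, predicted_tables, count_false_positives):
    """Return the weight of one page under the rules described above."""
    if gt_tables > 0:
        # Weight is proportional to the number of ground truth tables.
        return float(gt_tables)
    if count_false_positives and predicted_tables > 0:
        # False positive page: counts as one table's worth of weight.
        return 1.0
    return 0.0


def weighted_mean(pages, metric, count_false_positives):
    """Weighted average of `metric` over pages; NaN if no page has weight."""
    weighted_sum = 0.0
    weight_total = 0.0
    for page in pages:
        w = page_weight(
            page["total_tables"], page["predicted_tables"], count_false_positives
        )
        if w == 0.0:
            continue  # page does not contribute to this metric
        weighted_sum += w * page[metric]
        weight_total += w
    return weighted_sum / weight_total if weight_total else math.nan


# Hypothetical pages: two with ground truth tables, one false positive page.
PAGES = [
    {"total_tables": 2, "predicted_tables": 2, "table_level_acc": 1.0, "content_score": 0.8},
    {"total_tables": 1, "predicted_tables": 1, "table_level_acc": 0.5, "content_score": 0.6},
    {"total_tables": 0, "predicted_tables": 1, "table_level_acc": 0.0, "content_score": None},
]

# table_level_acc includes the false positive page with weight 1.
acc = weighted_mean(PAGES, "table_level_acc", count_false_positives=True)
# Structure/content style metrics skip false positive pages entirely.
content = weighted_mean(PAGES, "content_score", count_false_positives=False)
```

Here `acc` is (2·1.0 + 1·0.5 + 1·0.0) / 4 = 0.625, while `content` is (2·0.8 + 1·0.6) / 3, because the false positive page carries no weight for that metric.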

test

This PR updates the existing test for evaluating table metrics:

  • adds a second file with just 1 table, alongside the existing file with 2 tables
  • tests that the weighted average is written to the report

badGarnet avatar Jul 03 '24 15:07 badGarnet

There is one scenario we need to account for: when there are 0 tables in the ground truth file but there are some false positives. I think it could be treated as weight=1, what do you think? Right now the file will not be counted, right?

plutasnyy avatar Jul 04 '24 09:07 plutasnyy

good call; that makes sense

badGarnet avatar Jul 04 '24 19:07 badGarnet

@plutasnyy actually, in the code we already filter down to only rows with non-zero "total_tables". If we intend to change that behavior, it would be better to do it in a different PR, since it changes the existing behavior for tallying tables.

badGarnet avatar Jul 08 '24 15:07 badGarnet