Document Classification Metric (UDOP)
Hello, thank you very much for sharing the code for this amazing work. I have been trying to reproduce and evaluate document classification. I noticed that you encode the class names and use the resulting token sequences as labels. However, this produces sequences of 2-3 tokens per class. How do you use them to evaluate accuracy?
Thank you very much
I manually decoded the output and mapped the resulting string to class labels (which does not seem intuitive). However, the results were sometimes close but not exact. Here are some examples of outputs where the model "failed":
news
advertisementwritten
news
scientific
formwritten
presentationwritten
news
news
file
scientific
file report
news
news
file
scientific
email report
file
presentationwritten
file report
scientific
scientific
scientific
hand
advertisement article
news
scientific
form folder
news
file
presentationwritten
scientific
scientific
scientific article
presentation report
presentation article
news
scientificwritten
file
scientificwritten
hand
scientific
How do you deal with such cases?
Another thing: are the checkpoints you supplied finetuned on RVL-CDIP? It seems that you use it in the example_io notebook.
Thanks
For example, if scientificwritten is encoded as 2-3 tokens, then the evaluation is an exact match on those tokens. The model has to predict all of the subtokens, and a prediction counts as correct only if every token matches.
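
The exact-match rule described above could be sketched roughly as follows. This is only an illustration, not the code from the repository: the function name `exact_match_accuracy`, the `pad_id` value, and the token ids in the usage example are all assumptions made up for this sketch.

```python
from typing import List, Sequence


def exact_match_accuracy(
    predictions: Sequence[Sequence[int]],
    references: Sequence[Sequence[int]],
    pad_id: int = 0,  # assumed padding id; use the tokenizer's real value
) -> float:
    """Fraction of samples whose predicted token ids equal the label token ids.

    A sample counts as correct only if every (non-padding) token matches,
    so a label that spans 2-3 subtokens must be predicted in full.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    def strip_padding(seq: Sequence[int]) -> List[int]:
        # Drop padding so that padded and unpadded sequences compare equal.
        return [tok for tok in seq if tok != pad_id]

    correct = sum(
        strip_padding(pred) == strip_padding(ref)
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical token ids, for illustration only.
preds = [[5, 6], [5, 7], [8]]
labels = [[5, 6], [5, 6], [8, 0]]
accuracy = exact_match_accuracy(preds, labels)
```

Under this rule a partially correct output (for instance, the first subtoken right and the second wrong) scores the same as a completely wrong one, which is consistent with the "close but not exact" outputs listed above being counted as errors.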