Zilong Wang comments

Results: 22 comments of Zilong Wang

LayoutReader: Evaluation metric ARD

Yes, we first sum the "RD" over all pages in the test set and then divide by the number of pages in the test set, so the final...
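As a rough illustration of the averaging step described here, the sketch below takes one "RD" value per page and averages them over the test set. How "RD" itself is computed for a page is not shown in this comment, so the per-page values and the function name `average_relative_distance` are hypothetical.

```python
from typing import Sequence


def average_relative_distance(rd_per_page: Sequence[float]) -> float:
    """Sum the per-page RD values, then divide by the number of pages."""
    if not rd_per_page:
        raise ValueError("need at least one page to average over")
    return sum(rd_per_page) / len(rd_per_page)


# Hypothetical per-page RD values for a three-page test set.
page_rds = [0.0, 1.5, 3.0]
print(average_relative_distance(page_rds))  # 1.5
```

Every page contributes equally here, regardless of how many tokens it contains.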

[layoutreader] can I use the [layoutlmv3-base-chinese] for reading order detection task?

LayoutReader is initialized with the weights of LayoutLM (v1) and is further trained on ReadingBank for reading order detection. Theoretically, it is possible to integrate the...

[LayoutReader] eval result not aligned with readme

Hi! Thanks for your interest in our work! We are sorry to hear that you couldn't reproduce the expected results. All results reported in our paper are...

Unclear input data structure of layoutreader

@SimeonZhang Hi, it is easy to run LayoutReader in the `text only` setting by passing the corresponding `model_name` and `model_name_or_path` in args. You can also run the `layout only` setting with...

Unclear input data structure of layoutreader

@ManuelFay Thank you for the great job! As for your question, we ran LayoutReader on the ReadingBank dataset and filtered out pages with more than 511 tokens. For other...
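A minimal sketch of that length filter, assuming each page is already available as a list of tokens. The function name `filter_long_pages` and the page representation are hypothetical and not taken from the LayoutReader code; only the 511-token threshold comes from the comment.

```python
from typing import List

MAX_TOKENS = 511  # threshold mentioned in the comment


def filter_long_pages(
    pages: List[List[str]], max_tokens: int = MAX_TOKENS
) -> List[List[str]]:
    """Keep only pages with at most `max_tokens` tokens."""
    return [tokens for tokens in pages if len(tokens) <= max_tokens]


# Hypothetical pages with 10, 511, and 512 tokens.
pages = [["w"] * 10, ["w"] * 511, ["w"] * 512]
kept = filter_long_pages(pages)
print([len(p) for p in kept])  # [10, 511]
```

Whether the count refers to words or subword pieces depends on the tokenizer in use, so the threshold should be applied after the same tokenization the model sees.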

Unclear input data structure of layoutreader

Actually, [load_and_cache_line_order_examples](https://github.com/microsoft/unilm/blob/e4929f812398207b7fefb4dda6e9debcb8ce34b9/layoutreader/s2s_ft/utils.py#L339) is deprecated. You can write a similar function yourself if you need to run such experiments.

[layoutreader] Recommended good labeling tools for reading order detection

End-to-End OCR with LayoutReader

[LayoutReader] Training loss is low but inference performs terrible

[LayoutReader] Training loss is low but inference performs terrible