Bozhi Luan comments

Results: 2 comments by Bozhi Luan

prepare_stage2_question.py cannot be found.

Thank you for your interest in our work. The prepare_stage2_question.py file corresponds to the second ablation experiment in the paper, which involves only the ground operation. In this experiment, the...

Results on the TextVQA benchmark

In LLaVA's TextVQA evaluation, OCR data was incorporated into the textual questions. We instead followed the evaluation protocol of MultimodalOCR (paper: https://arxiv.org/abs/2305.07895; code: https://github.com/Yuliang-Liu/MultimodalOCR). In their TextVQA evaluation dataset, OCR data wasn't utilized....
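
To make the difference between the two evaluation settings concrete, here is a minimal sketch of building the question text with and without OCR tokens. The function name `build_question`, the "Reference OCR tokens:" wording, and the sample question are hypothetical and chosen for illustration only; they are not taken from the LLaVA or MultimodalOCR code.

```python
from typing import Optional, Sequence


def build_question(question: str, ocr_tokens: Optional[Sequence[str]] = None) -> str:
    """Return the text prompt for one TextVQA sample.

    If OCR tokens are given, they are appended to the question (the
    OCR-augmented setting). Otherwise the question is returned unchanged
    (the setting without OCR data). The template is an assumption.
    """
    if ocr_tokens:
        return f"{question}\nReference OCR tokens: {', '.join(ocr_tokens)}"
    return question


# Hypothetical sample, for illustration only.
with_ocr = build_question("What brand is shown?", ["acme", "cola"])
without_ocr = build_question("What brand is shown?")
```

Scores obtained under the two settings are not directly comparable, because in the first one the model receives the recognized text as part of its input.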