Bozhi Luan
Results: 2 comments by Bozhi Luan
Thank you for your interest in our work. The prepare_stage2_question.py file corresponds to the second ablation experiment in the paper, which involves only the ground operation. In this experiment, the...
LLaVA's TextVQA evaluation incorporates OCR data into the textual questions. We instead ran our evaluation following MultimodalOCR (https://arxiv.org/abs/2305.07895) (https://github.com/Yuliang-Liu/MultimodalOCR), whose TextVQA evaluation dataset does not use OCR data....
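To make the difference between the two evaluation setups concrete, here is a minimal sketch of how a TextVQA question might be built with and without OCR data. The helper `build_textvqa_prompt` and the "Reference OCR token:" wording are assumptions for illustration only; they are not taken from either codebase, and the actual prompt templates may differ.

```python
from typing import Optional, Sequence


def build_textvqa_prompt(question: str,
                         ocr_tokens: Optional[Sequence[str]] = None) -> str:
    """Hypothetical helper: build the text prompt for one TextVQA sample.

    If ocr_tokens is given, the tokens are appended to the question
    (the OCR-augmented setup). If it is None or empty, only the question
    is returned (the OCR-free setup).
    The exact template wording here is an assumption.
    """
    parts = [question.strip()]
    if ocr_tokens:
        # Assumed format: a single extra line listing the OCR tokens.
        parts.append("Reference OCR token: " + ", ".join(ocr_tokens))
    return "\n".join(parts)


# OCR-augmented prompt: the model also sees text read from the image.
with_ocr = build_textvqa_prompt("What brand is this phone?", ["nokia", "3310"])

# OCR-free prompt: the model must read the text from the image itself.
without_ocr = build_textvqa_prompt("What brand is this phone?")
```

Because the OCR-augmented prompt hands the model the text in the image, scores from the two setups are not directly comparable.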