Language Support

Open eschaffn opened this issue 2 years ago • 2 comments

Hi, a few questions: does this model work only on English?

If so, what would it take to train it on another language or script type?

Would it need to be pretrained again using self-supervision, and how computationally expensive is the pretraining process?

Thank you!

eschaffn avatar Jul 27 '23 15:07 eschaffn

+1

CheungZeeCn avatar Jul 31 '23 07:07 CheungZeeCn

UDOP is currently English-only. To fully replicate this work in a multilingual domain, you will need multilingual documents with OCR annotations, and labeled multilingual documents for supervised pretraining.

ziyi-yang avatar Aug 31 '23 23:08 ziyi-yang