CheungZee

Results 6 issues of CheungZee

According to the README, it seems we are masking out the signal from the supervised dataset. Isn't that different from what the paper calls confidence-based masking?

File "/home/ana/data2/iz3/izhuomi_engines/libs/pystardict.py", line 405, in __init__
  self.idx = _StarDictIdx(dict_prefix=filename_prefix, container=self)
File "/home/ana/data2/iz3/izhuomi_engines/libs/pystardict.py", line 159, in __init__
  matched_records = re.findall(record_pattern, self._file)
File "/home/ana/data1/anaconda3/envs/pds/lib/python3.9/re.py", line 241, in findall
  return _compile(pattern, flags).findall(string)
TypeError:...

**Describe** Model I am using (UniLM, MiniLM, LayoutLM ...): Is there any LayoutLM pre-training code available that we can use for custom data?

How can synthdog be configured to generate much longer text, e.g. a total length of about 1024-2048?

In config: "mae_checkpoint": "mae_models/mae_pretrain_vit_large_full.pth"
In udop_dual: self.vision_encoder = mae_model(config.mae_version, config.mae_checkpoint, config.image_size, config.vocab_size, config.max_2d_position_embeddings)
But I found no pretrained weights for the MAE encoder. Are the pretrained MAE encoder weights available now?...

### Finetuning on RVLCDIP
Download RVLCDIP first and change the path
For OCR, you might need to customize your code
```
bash scripts/finetune_rvlcdip.sh # Finetuning on RVLCDIP
```
Q1. which...