Changxu Cheng

15 comments by Changxu Cheng

[1] A ConvNet for the 2020s. [2] Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs.

I also encountered the issue and finally fixed it. In my original code, I had an operation like:

```python
L = 0.0
for ...:
    L = L + a_torch_tensor
L...
```
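The original comment is truncated, so the exact fix is not shown. A common cause of this kind of problem, however, is that accumulating raw PyTorch tensors in a Python loop keeps the entire autograd graph alive across iterations. A minimal sketch of the buggy pattern and a likely fix (all variable names here are hypothetical):

```python
import torch

# Stand-in tensors; in the real code each would come from a forward pass.
losses = [torch.tensor(1.0, requires_grad=True) * i for i in range(5)]

# Buggy pattern: L is a tensor that depends on every previous iteration,
# so the autograd graph (and memory use) grows with the loop.
L = 0.0
for loss in losses:
    L = L + loss

# Safer pattern when only the numeric value is needed (e.g. for logging):
total = 0.0
for loss in losses:
    total += loss.detach().item()  # plain Python float, no graph retained

print(total)  # 0 + 1 + 2 + 3 + 4 = 10.0
```

If the accumulated value must still be backpropagated through, keep the tensor sum but call `backward()` inside the loop (or on mini-batches) instead of holding the whole graph.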

> Why did you guys choose to use the majority direction in the positive set? We assume that most relation predictions made by a simple/coarse relation head are right in a...

Hi, I have re-implemented training using only SynthText, and everything is OK after 2000 steps. ![image](https://github.com/AlibabaResearch/AdvancedLiterateMachinery/assets/31085521/ae490763-3185-42c7-a968-34a12891f977) Please refer to the attention scaling operation [here](https://github.com/AlibabaResearch/AdvancedLiterateMachinery/tree/main/OCR/LISTER#model-checkpoints) in case it was changed on your side....

Hi, the token type embeddings are the same for all tokens; they are retained as used in BERT/BROS. The `1D Seg. Rank Embeddings` and `1D Seg. BIE Embeddings` are exactly...

> I trained the model without pretrained weights; the final results are as below [according to the paper, SER F1 should be 83.39%, RE F1 should be 74.91%] In the...

We use both tasks to finetune the model, but the evaluation on each task is independent.

Hi Manikanta, the two tasks are performed by two separate heads. We only conducted multi-task finetuning.
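The setup described above (one shared backbone, two task heads finetuned jointly but evaluated independently) can be sketched as follows. This is a minimal illustration with hypothetical module names and dimensions, not the actual GeoLayoutLM code:

```python
import torch
import torch.nn as nn

hidden, num_labels = 8, 5
encoder = nn.Linear(8, hidden)        # stand-in for the shared backbone
ser_head = nn.Linear(hidden, num_labels)  # semantic entity recognition head
re_head = nn.Linear(hidden, 1)            # relation extraction head

x = torch.randn(2, 8)
h = encoder(x)                 # shared features
ser_logits = ser_head(h)       # consumed only by the SER evaluation
re_logits = re_head(h)         # consumed only by the RE evaluation
print(ser_logits.shape, re_logits.shape)
```

During multi-task finetuning both heads' losses are summed and backpropagated through the shared encoder, while each benchmark metric is computed from its own head's outputs alone.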

Hi, please refer to [this line](https://github.com/AlibabaResearch/AdvancedLiterateMachinery/blob/main/DocumentUnderstanding/GeoLayoutLM/model/geolayoutlm_vie.py#L159), where the visual and textual features are added for RE.
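In essence, the linked line fuses the two modalities by element-wise addition before the RE head. A minimal sketch, assuming hypothetical tensor names and shapes (batch, sequence length, hidden size):

```python
import torch

batch, seq_len, hidden = 2, 4, 8
text_feat = torch.randn(batch, seq_len, hidden)    # textual token features
visual_feat = torch.randn(batch, seq_len, hidden)  # aligned visual features

# Element-wise sum: the fused representation fed into the RE head.
fused = text_feat + visual_feat
print(fused.shape)  # torch.Size([2, 4, 8])
```

Addition (rather than concatenation) keeps the hidden size unchanged, so the downstream head needs no extra projection.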

Yes, you may manually set the linking loss to 0 and ignore the linking part.
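Zeroing the linking term amounts to weighting it out of the total objective, so only the labeling loss drives the gradients. A hypothetical sketch (loss names and values are illustrative, not from the repository):

```python
import torch

ser_loss = torch.tensor(0.7)      # labeling (SER) objective
linking_loss = torch.tensor(1.3)  # linking (RE) objective

linking_weight = 0.0  # set manually to ignore the linking part
total_loss = ser_loss + linking_weight * linking_loss
print(total_loss.item())  # 0.7
```

Multiplying by a zero weight (instead of deleting the term) keeps the training loop unchanged, so re-enabling the linking loss later is a one-line edit.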