GLM
The multi-task learning setting is different from the original paper
According to the GLM paper, multi-task learning comes in two variants: one mixes the blank-infilling objective with the sentence-level objective, and the other mixes the blank-infilling objective with the document-level objective. But when I read the pre-training script (/config/ds_block_chinese.sh), I found that multi-task learning uses 40% blank-infilling objective, 30% sentence-level objective, and 30% document-level objective. Am I misunderstanding something?
...
gpt_options=" \
--block-lm \
--task-mask \
--bert-prob 0.4 \
--gap-sentence-prob 0.3 \
...
The two ways of multi-task learning were designed for the ablation study of the different objectives. To enable adaptation to a variety of downstream tasks, we mix all three types of objectives in the Chinese model.
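For reference, a minimal sketch of how such a per-sample objective mix could work, assuming the probability mass left over after `--bert-prob` and `--gap-sentence-prob` goes to the document-level objective (this is an illustration, not the actual GLM data-loader code):

```python
import random

# Probabilities from the script above; the remainder is assumed
# to go to the document-level (GPT-style) objective.
BERT_PROB = 0.4          # --bert-prob: blank-infilling objective
GAP_SENTENCE_PROB = 0.3  # --gap-sentence-prob: sentence-level objective

def sample_objective(rng=random):
    """Pick one of the three objectives for a training sample.

    Hypothetical helper for illustration only.
    """
    r = rng.random()
    if r < BERT_PROB:
        return "blank_infilling"       # 40%
    elif r < BERT_PROB + GAP_SENTENCE_PROB:
        return "sentence_level"        # 30%
    else:
        return "document_level"        # remaining 30%
```

Sampling an objective independently for each training instance yields the 40/30/30 mix described in the question.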