GLM
The multi-task learning setting is different from the original paper
According to the GLM paper, multi-task learning comes in two variants: one mixes the blank-infilling objective with the sentence-level objective, and the other mixes the blank-infilling objective with the document-level objective. But when I read the pre-training script (/config/ds_block_chinese.sh), I found that multi-task learning uses 40% blank-infilling objective, 30% sentence-level objective, and 30% document-level objective. Am I misunderstanding something?
...
gpt_options=" \
--block-lm \
--task-mask \
--bert-prob 0.4 \
--gap-sentence-prob 0.3 \
...
The two ways of multi-task learning were designed for the ablation study of the different objectives. To enable adaptation to a variety of downstream tasks, we mix all three types of objectives in the Chinese model.
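For reference, a minimal sketch of how such a per-sample objective mix could work, assuming the probability mass left over after `--bert-prob` and `--gap-sentence-prob` goes to the document-level objective (this is an illustration, not the actual GLM data-loader code):

```python
import random

# Probabilities from the script above; the remainder is assumed
# to go to the document-level (GPT-style) objective.
BERT_PROB = 0.4          # --bert-prob: blank-infilling objective
GAP_SENTENCE_PROB = 0.3  # --gap-sentence-prob: sentence-level objective

def sample_objective(rng=random):
    """Pick one of the three objectives for a training sample.

    Hypothetical helper for illustration only.
    """
    r = rng.random()
    if r < BERT_PROB:
        return "blank_infilling"       # 40%
    elif r < BERT_PROB + GAP_SENTENCE_PROB:
        return "sentence_level"        # 30%
    else:
        return "document_level"        # remaining 30%
```

Sampling an objective independently for each training instance yields the 40/30/30 mix described in the question.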