kkelev
Hi, after training a vanilla transformer on WMT17 zh-en, I tried to translate the source side of the training set (about 19M). I tried fairseq-generate and fairseq-interactive, but the speed...
Hi, I want to train a NAT model for zh-en (about 260k). I get about 30 BLEU with the teacher model, but the student model always overfits. There are...
Could you please provide the processing code for the BEA data?
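Reproduction
Hello, could you share the hyperparameters you used when training with fairseq? My best result on the WMT 1M or 300k datasets is 21, and I would like to see what is wrong with my hyperparameters.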
The following is the command and log from my fairseq run (the same operation worked a month ago, but now I tried the fairseq version that worked before...