
About prepro and MMI training

Open liehtman opened this issue 5 years ago • 0 comments

I have two questions about training the reversed model. The first is about the training data: I can't see an objective reason why prepro.py cuts off a large part of the training data. I just realized that almost all samples which have only one sentence in the source are cut off by the `_make_feature` function — more specifically by `if all(w == 0 for w in ws[1:]): return None`. I use the `--reverse` parameter when preparing the data.
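To make the filtering concrete, here is a minimal sketch of the check quoted above. The names `turns` and `weights`, and everything around the quoted `if` line, are my assumptions for illustration, not the actual `_make_feature` implementation:

```python
def make_feature_sketch(turns, weights):
    """Sketch of the filtering step: a sample is dropped when every
    turn after the first has weight 0, i.e. there is no weighted
    target left to train on.

    `turns` is a list of utterance strings; `weights` is a parallel
    list of per-turn weights (hypothetical names).
    """
    ws = weights
    # The line quoted from prepro.py's _make_feature:
    if all(w == 0 for w in ws[1:]):
        return None  # sample filtered out, as observed in the question
    return list(zip(turns, ws))

# A sample whose only weighted turn is the first one gets dropped —
# which is what happens to single-sentence sources after --reverse:
print(make_feature_sketch(["hi", "hello"], [1.0, 0.0]))  # -> None
print(make_feature_sketch(["hi", "hello"], [0.0, 1.0]))  # -> [('hi', 0.0), ('hello', 1.0)]
```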

The second question is about the validation data. If we train the forward model, it's obvious that we need something like `src1 <eos> src2 \t tgt`, but how should it look when we train the backward model? My assumption was `tgt \t src2 <eos> src1` because of `inputs = list(reversed(inputs))`, but the model's performance is very poor while training, and the quality on such a validation set stops improving after a very small number of training steps.
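For what it's worth, here is a small sketch of what `inputs = list(reversed(inputs))` does to a turn list, assuming the line format keeps the tab before the final (target) turn in both directions. This is only one plausible reading — the exact format the validation code expects is precisely what I'm asking about:

```python
EOS = "<eos>"

def build_line(turns, reverse=False):
    """Join dialogue turns into a 'src1 <eos> src2 \t tgt' line.
    With reverse=True the turn order is flipped first, mirroring
    inputs = list(reversed(inputs)) in the training code.
    (Sketch only; tokenization details are assumptions.)
    """
    inputs = list(turns)
    if reverse:
        inputs = list(reversed(inputs))
    *context, target = inputs
    return f" {EOS} ".join(context) + "\t" + target

turns = ["src1", "src2", "tgt"]
print(build_line(turns))                # src1 <eos> src2\ttgt
print(build_line(turns, reverse=True))  # tgt <eos> src2\tsrc1
```

Note that under this reading the reversed line would be `tgt <eos> src2 \t src1`, not `tgt \t src2 <eos> src1` — which might explain the poor results if the validation file uses the latter.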

Thanks in advance.

liehtman avatar May 08 '20 09:05 liehtman