About prepro.py and MMI training
I have two questions about training the reversed model.
The first one is about the training data. I can't see an objective reason why `prepro.py` cuts off a big part of the training data. I just realized that almost all samples which have only one sentence in the source are cut off by the `_make_feature` function, more specifically by `if all(w == 0 for w in ws[1:]): return None`. I use the `--reverse` parameter when preparing the data.
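For clarity, here is a minimal sketch of how I read that check (the weight convention is my assumption: each turn carries a loss weight, and 0 means no loss is computed on that turn):

```python
def is_dropped(ws):
    """True when the _make_feature check would discard the sample.

    ws = per-turn loss weights, in the order the turns appear
    after any --reverse reordering (my assumption).
    """
    return all(w == 0 for w in ws[1:])

# A two-turn sample src -> tgt with weights [0.0, 1.0] passes the filter,
# but if --reverse also reverses the weights to [1.0, 0.0], the same
# sample is dropped -- which would explain why every sample with a
# single-sentence source disappears for me.
print(is_dropped([0.0, 1.0]))  # False: kept (forward order)
print(is_dropped([1.0, 0.0]))  # True: dropped (reversed order)
```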
The second question is about the validation data. If we train the forward model, it's obvious that we need something like `src1 <eos> src2 \t tgt`, but how should it look when we train the backward model? My assumption was `tgt \t src2 <eos> src1`, because of `inputs = list(reversed(inputs))`, but the model's performance during training is very poor, and the quality on such a validation set stops improving after a very small number of training steps.
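To make the assumption concrete, this is a minimal sketch of how I rewrite the forward validation lines into that layout (the helper name and the literal `<eos>` token are just for illustration; the real tokenizer may represent it differently):

```python
EOS = "<eos>"

def to_backward(line: str) -> str:
    """Rewrite a forward line 'src1 <eos> src2 \\t tgt' into my
    assumed backward layout 'tgt \\t src2 <eos> src1'."""
    context, response = line.split("\t")
    sources = [s.strip() for s in context.split(EOS)]
    reversed_context = f" {EOS} ".join(reversed(sources))
    return f"{response.strip()}\t{reversed_context}"

print(to_backward("src1 <eos> src2\ttgt"))
# -> 'tgt\tsrc2 <eos> src1'
```

Is this the intended format for the backward model, or should the split between context and response be placed differently?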
Thanks in advance.