Scheduled Sampling
In the scheduled sampling paper, it is mentioned that tossing a coin once and then feeding the model's predicted outputs for the whole sequence (or not) actually performs worse. Instead, the choice between the correct token and the model's own prediction should be made independently at each time step (see the footnote on p. 3 of the paper). Yet in the decoder here, teacher forcing is either enabled for the whole sequence or not at all, so I don't think that would work.
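For reference, a minimal sketch of the per-time-step variant inside a decoder loop might look like this (`decoder_step`, `embed`, and `targets` are placeholder names for illustration, not this repo's actual API):

```python
import torch

def decode_with_scheduled_sampling(decoder_step, embed, targets, hidden,
                                   sampling_prob):
    """Decode with a per-time-step coin flip: at every step, decide
    whether the next input is the ground-truth token or the model's
    own previous prediction.

    decoder_step: runs one decoder step -> (logits, hidden)
    targets: (batch, seq_len) ground-truth token ids
    sampling_prob: probability of feeding the model's own prediction
    """
    batch_size, seq_len = targets.shape
    inputs = targets[:, 0]  # e.g. the <sos> token
    all_logits = []
    for t in range(1, seq_len):
        logits, hidden = decoder_step(embed(inputs), hidden)
        all_logits.append(logits)
        predicted = logits.argmax(dim=-1)
        # Independent coin flip per step (and per example), rather
        # than a single flip for the whole sequence.
        use_model = torch.rand(batch_size, device=targets.device) < sampling_prob
        inputs = torch.where(use_model, predicted, targets[:, t])
    return torch.stack(all_logits, dim=1), hidden
```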
You might be right, but the teacher forcing here can really improve performance by 1~2 points~
@AtmaHou Did you mean the kind of teacher forcing that is implemented here? I tried that and it actually doesn't improve performance (in agreement with the scheduled sampling paper).
Yep~~ You could try tuning the teacher forcing ratio (default 0); 0.5 is worth trying. I found that neither 0 nor 1 helps. emmmmm..... From my point of view, scheduled sampling is just a trick to let the model see its own output at some random rate, and both methods achieve this.
@AtmaHou My experience with this kind of teacher forcing on non-trivial tasks has not been good so far; it sometimes worsens my results. The scheduled sampling method works better, though.
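One detail worth noting: the paper doesn't use a fixed rate but decays the probability of feeding the ground-truth token as training progresses, via a linear, exponential, or inverse sigmoid schedule. A rough sketch of those schedules (the constants `k`, `c`, and `floor` are tunable knobs, not values from the paper or this repo):

```python
import math

def teacher_forcing_prob(i, schedule="inverse_sigmoid",
                         k=1000.0, c=1e-5, floor=0.0):
    """Probability of feeding the ground-truth token at training step i,
    decayed over training. k, c, and floor are tunable constants."""
    if schedule == "linear":
        # eps_i = max(floor, k - c*i), with k <= 1
        return max(floor, min(k, 1.0) - c * i)
    if schedule == "exponential":
        # eps_i = k**i, with k < 1 (e.g. k = 0.9999)
        return k ** i
    if schedule == "inverse_sigmoid":
        # eps_i = k / (k + exp(i / k)), with k >= 1
        return k / (k + math.exp(i / k))
    raise ValueError(f"unknown schedule: {schedule}")
```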
Since this repo has so many stars, and at one point I was using it as a reference implementation, I thought I should point this out.
@umgupta Ha~ Your post has also deepened my understanding of teacher forcing. Maybe I should implement the kind of teacher forcing you pointed out, which could further improve my model's performance.
@AtmaHou Sure, do so and let me know :).
Also, I am fairly new to sequence learning. Do you happen to know a toy problem for comparing algorithms or sanity-checking them (like MNIST for images)? The sequence-reversal task in this repo is too trivial: any kind of teacher forcing works OK on it, and even code with mistakes can still get good results.
@umgupta The machine translation problem in the PyTorch tutorial is quite simple; it might satisfy you.