pytorch-seq2seq
An open source framework for seq2seq models in PyTorch.
Hi, I don't understand why teacher forcing is applied per whole sequence. The definition of teacher forcing states that at each timestep, a predicted or the...
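For reference, a minimal sketch of what per-timestep teacher forcing could look like (this is not the repo's DecoderRNN; `decoder_cell`, `embedding`, and `out_proj` are hypothetical stand-ins for a single decoder step):

```
import random
import torch

# Hypothetical per-timestep teacher forcing; decoder_cell, embedding and
# out_proj are stand-ins for one decoder step, not this repo's classes.
def decode(decoder_cell, embedding, out_proj, hidden, targets, sos_idx, tf_ratio=0.5):
    batch_size, max_len = targets.size()
    inp = targets.new_full((batch_size,), sos_idx)      # start every sequence with <sos>
    all_logits = []
    for t in range(max_len):
        hidden = decoder_cell(embedding(inp), hidden)   # one RNN step
        logits = out_proj(hidden)                       # (batch, vocab)
        all_logits.append(logits)
        if random.random() < tf_ratio:
            inp = targets[:, t]                         # teacher forcing: feed ground truth
        else:
            inp = logits.argmax(dim=-1)                 # free running: feed own prediction
    return torch.stack(all_logits, dim=1), hidden       # (batch, max_len, vocab)
```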
```
/pytorch-seq2seq/seq2seq/models/EncoderRNN.py", line 68, in forward
    embedded = self.embedding(input_var)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 479, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/sparse.py", line 113, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py",...
```
Compared to OpenNMT, why do we need this [block](https://github.com/IBM/pytorch-seq2seq/blob/master/seq2seq/models/TopKDecoder.py#L257), which handles sequences that are dropped after seeing EOS early? (There is no equivalent in OpenNMT's beam search implementation.) They are also...
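For context, a common way a beam search deals with hypotheses that emit EOS early is simply to freeze their scores so they stop competing for new tokens; a hedged sketch (not TopKDecoder's actual code):

```
import torch

# Once a beam has emitted EOS, force it to keep "predicting" EOS with
# probability 1 so its accumulated score stays frozen.
def mask_finished(step_log_probs, finished, eos_idx):
    # step_log_probs: (batch * k, vocab) log-probabilities for the current step
    # finished:       (batch * k,) bool mask of beams that already emitted EOS
    masked = step_log_probs.clone()
    masked[finished] = float('-inf')     # no ordinary token can extend a finished beam
    masked[finished, eos_idx] = 0.0      # log(1): repeating EOS adds nothing to the score
    return masked
```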
Hi, it seems that Perplexity is normalized twice, and the norm_term of NLLLoss should be masked out as well.
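To illustrate the claim, a minimal sketch of a masked perplexity where padding is excluded from both the summed loss and the normalization term, with the exponential applied exactly once (this is not the library's Perplexity class):

```
import torch
import torch.nn.functional as F

def perplexity(logits, targets, pad_idx):
    # logits: (batch, seq_len, vocab), targets: (batch, seq_len)
    log_probs = F.log_softmax(logits, dim=-1)
    nll_sum = F.nll_loss(log_probs.view(-1, log_probs.size(-1)),
                         targets.view(-1),
                         ignore_index=pad_idx,   # padded positions contribute no loss
                         reduction='sum')
    num_tokens = (targets != pad_idx).sum()      # normalize by real tokens only
    return torch.exp(nll_sum / num_tokens)
```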
Can somebody tell me what type of attention is used in this library? I checked it against the Bahdanau and Luong attentions and it doesn't look like either, or maybe...
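For comparison, the two standard scoring functions look roughly like this (an illustrative sketch to check the library's Attention module against, with assumed shapes; not the module itself):

```
import torch
import torch.nn as nn

# Shapes assumed: decoder output (batch, 1, dim), encoder outputs (batch, src_len, dim).
def luong_dot_score(dec_out, enc_outs):
    # Luong "dot": a plain batched dot product between decoder and encoder states.
    return torch.bmm(dec_out, enc_outs.transpose(1, 2))          # (batch, 1, src_len)

class BahdanauScore(nn.Module):
    # Bahdanau "additive": project both states, add, tanh, then a learned vector.
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)
        self.U = nn.Linear(dim, dim, bias=False)
        self.v = nn.Linear(dim, 1, bias=False)

    def forward(self, dec_out, enc_outs):
        # dec_out broadcasts over the source-length dimension of enc_outs
        return self.v(torch.tanh(self.W(dec_out) + self.U(enc_outs))).transpose(1, 2)
```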
Hi, I'm using this framework on my dataset. Everything works fine on CPU, but when I moved it to GPU I got the following error: `File "/home/ibm_decoder/DecoderRNN.py", line 107,...
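The usual cause of this class of error is a model whose parameters live on the GPU receiving an index tensor that is still on the CPU; a minimal sketch of the typical fix, assuming that is what is happening here:

```
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

embedding = nn.Embedding(100, 16).to(device)
input_var = torch.randint(0, 100, (4, 10))     # created on the CPU by default

# embedding(input_var) would fail on CUDA with a CPU/CUDA backend mismatch;
# move the indices to the model's device first.
output = embedding(input_var.to(device))
```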
Hi, I wonder whether rnn.forward_step changes the order of the (batch_size * self.k) dimension? With the code that initializes sequence_scores:  and at each step:  it seems that sequence_scores is updated...
fix bug #185 #161 Expected object of backend CUDA but got backend CPU for argument #3 'index' when running `example.py`
https://github.com/IBM/pytorch-seq2seq/blob/f146087a9a271e9b50f46561e090324764b081fb/seq2seq/models/DecoderRNN.py#L105 I think .view(batch_size, output_size, -1) should be .view(batch_size, -1, output_size); otherwise this line makes no sense.
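Whether the existing line is correct depends on what output_size holds at that point, but the underlying issue is that view() only reinterprets the flat buffer and never permutes it, so the argument order decides which axis ends up holding the vocabulary scores. A small self-contained sketch:

```
import torch

batch_size, seq_len, vocab = 2, 3, 5
flat = torch.arange(batch_size * seq_len * vocab).view(batch_size * seq_len, vocab)

as_vocab_last  = flat.view(batch_size, -1, vocab)     # (batch, seq_len, vocab)
as_vocab_first = flat.view(batch_size, vocab, -1)     # (batch, vocab, seq_len)

print(torch.equal(as_vocab_last[0, 0], flat[0]))      # True: each row of scores stays intact
print(torch.equal(as_vocab_first[0, :, 0], flat[0]))  # False: scores are scrambled across steps
```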
https://github.com/IBM/pytorch-seq2seq/blob/f146087a9a271e9b50f46561e090324764b081fb/seq2seq/models/TopKDecoder.py#L83 I think teacher_forcing should not be present in beam decoding, since ground-truth tokens are not known during inference.
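For reference, one step of a plain beam search only ever feeds back the decoder's own top-k predictions; a minimal sketch (not TopKDecoder itself):

```
import torch

def beam_step(step_log_probs, beam_scores, k):
    # step_log_probs: (batch, k, vocab); beam_scores: (batch, k)
    vocab = step_log_probs.size(-1)
    total = beam_scores.unsqueeze(-1) + step_log_probs      # accumulate hypothesis scores
    total = total.view(total.size(0), -1)                   # (batch, k * vocab)
    beam_scores, flat_idx = total.topk(k, dim=-1)
    prev_beam = flat_idx // vocab                            # which hypothesis to extend
    next_tokens = flat_idx % vocab                           # tokens fed to the next step
    return beam_scores, prev_beam, next_tokens
```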