
Error in Chapter_6.ipynb

galleon opened this issue 4 years ago • 3 comments

When running in Colab (using a GPU), I got the following error in this cell:

rnn_3layer = nn.Sequential( #Simple old style RNN 
  EmbeddingPackable(nn.Embedding(len(all_letters), 64)), #(B, T) -> (B, T, D)
  nn.RNN(64, n, num_layers=3, batch_first=True), #(B, T, D) -> ( (B,T,D) , (S, B, D)  )
  LastTimeStep(rnn_layers=3), #We need to take the RNN output and reduce it to one item, (B, D)
  nn.Linear(n, len(name_language_data)), #(B, D) -> (B, classes)
)

#Apply gradient clipping to maximize its performance
for p in rnn_3layer.parameters():
    p.register_hook(lambda grad: torch.clamp(grad, -5, 5))

rnn_results = train_network(rnn_3layer, loss_func, train_lang_loader, val_loader=test_lang_loader, score_funcs={'Accuracy': accuracy_score}, device=device, epochs=10)

Error is:

/usr/local/lib/python3.6/dist-packages/torch/nn/utils/rnn.py in pack_padded_sequence(input, lengths, batch_first, enforce_sorted)

    242 
    243     data, batch_sizes = \
--> 244         _VF._pack_padded_sequence(input, lengths, batch_first)
    245     return _packed_sequence_init(data, batch_sizes, sorted_indices, None)
    246 
RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, but got 1D cuda:0 Long tensor
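
For context, here is a minimal illustration of what the error is complaining about. This snippet is not from the notebook; it assumes PyTorch >= 1.7, where pack_padded_sequence requires the lengths tensor to be on the CPU even when the sequence data is on the GPU:

import torch
from torch.nn.utils.rnn import pack_padded_sequence

device = "cuda" if torch.cuda.is_available() else "cpu"
seqs = torch.randn(4, 10, 8, device=device)  # (B, T, D) padded batch, possibly on the GPU
lengths = torch.tensor([10, 7, 5, 3])        # kept on the CPU -> works

packed = pack_padded_sequence(seqs, lengths, batch_first=True, enforce_sorted=False)
# Passing lengths.to(device) instead reproduces:
# RuntimeError: 'lengths' argument should be a 1D CPU int64 tensor, ...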

galleon commented on Feb 14, 2021

Can you post which version of PyTorch your Colab session has when you get this error? And does this error also occur for you in Chapter 4?

The last few PyTorch releases have annoyingly had some breaking changes that don't appear in the changelog, so I may have missed a new one since I wrote this chapter.
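
If it helps, something like this in a Colab cell (illustrative snippet, not from the book) will report the installed build:

import torch
print(torch.__version__)   # e.g. 1.7.0+cu101
print(torch.version.cuda)  # CUDA toolkit the wheel was built against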

EdwardRaff commented on Feb 14, 2021

Yes, same error in Chapter 4 in:

packed_train = train_simple_network(rnn_packed, loss_func, train_loader, val_loader=test_loader, score_funcs={'Accuracy': accuracy_score}, device=device, epochs=20)

And in my Colab session, I am using PyTorch 1.7.0+cu101.

galleon commented on Feb 15, 2021

Ok, I think I know what is happening here. If you switch to 1.6.x, it should go away. I'll get this fixed once I get to revisions of chapters 4 and 6 with the editors. If you go to the code:

class EmbeddingPackable(nn.Module):
    """
    The embedding layer in PyTorch does not support Packed Sequence objects. 
    This wrapper class will fix that. If a normal input comes in, it will 
    use the regular Embedding layer. Otherwise, it will work on the packed 
    sequence to return a new Packed sequence of the appropriate result. 
    """
    def __init__(self, embd_layer):
        super(EmbeddingPackable, self).__init__()
        self.embd_layer = embd_layer 
    
    def forward(self, input):
        if type(input) == torch.nn.utils.rnn.PackedSequence:
            # We need to unpack the input, 
            sequences, lengths = torch.nn.utils.rnn.pad_packed_sequence(input.cpu(), batch_first=True)
            #Embed it
            sequences = self.embd_layer(sequences.to(input.data.device))
            #And pack it into a new sequence
            return torch.nn.utils.rnn.pack_padded_sequence(sequences, lengths.to(input.data.device), 
                                                           batch_first=True, enforce_sorted=False)
        else:#apply to normal data
            return self.embd_layer(input)

and change

return torch.nn.utils.rnn.pack_padded_sequence(sequences, lengths.to(input.data.device), batch_first=True, enforce_sorted=False)

to

return torch.nn.utils.rnn.pack_padded_sequence(sequences, lengths.cpu(), batch_first=True, enforce_sorted=False)

then it should run. It appears a breaking change happened and everyone is having fun with it: https://github.com/pytorch/pytorch/issues/43227
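
For convenience, a sketch of what the patched forward would look like with that one change (assuming PyTorch 1.7+, where lengths must be a CPU tensor; pad_packed_sequence already returns lengths on the CPU, so .cpu() is just being explicit):

    def forward(self, input):
        if isinstance(input, torch.nn.utils.rnn.PackedSequence):
            # Unpack on the CPU, embed on the original device, then repack.
            sequences, lengths = torch.nn.utils.rnn.pad_packed_sequence(input.cpu(), batch_first=True)
            sequences = self.embd_layer(sequences.to(input.data.device))
            # PyTorch 1.7+ requires the lengths tensor to live on the CPU.
            return torch.nn.utils.rnn.pack_padded_sequence(sequences, lengths.cpu(),
                                                           batch_first=True, enforce_sorted=False)
        else: #apply to normal data
            return self.embd_layer(input)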

EdwardRaff commented on Feb 16, 2021