Multi-layer RNN
Coded a multi-layer RNN decoder. It uses 2 layers by default; change `num_layers` when constructing `DecoderWithAttention` to adjust the number of layers. Please check.
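For reference, a minimal sketch of what the stacked-cell change could look like (attention, embeddings, and gating are omitted; apart from `num_layers` and `DecoderWithAttention`, the names and dimensions here are assumptions, not the exact code in this PR):

```python
import torch
import torch.nn as nn

class DecoderWithAttention(nn.Module):
    """Sketch of the multi-layer decoder: one LSTMCell per layer."""

    def __init__(self, embed_dim, decoder_dim, encoder_dim=2048, num_layers=2):
        super().__init__()
        self.num_layers = num_layers
        # Layer 0 consumes the word embedding concatenated with the
        # attention-weighted encoding; each higher layer consumes the
        # hidden state of the layer below it.
        self.cells = nn.ModuleList(
            [nn.LSTMCell(embed_dim + encoder_dim if i == 0 else decoder_dim,
                         decoder_dim)
             for i in range(num_layers)]
        )

    def step(self, x, h_list, c_list):
        # One decoding time step through the stacked cells.
        for i, cell in enumerate(self.cells):
            h_list[i], c_list[i] = cell(x, (h_list[i], c_list[i]))
            x = h_list[i]  # feeds the next layer up
        return h_list, c_list
```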
Does it achieve improved results when compared to a single-layer one? If yes, by what margin?
Hi kmario23, training is ongoing. So far I see clearly different loss curves for the 2-layer and 4-layer RNNs (I can't trace back the single-layer RNN's performance), but it's too early to tell how performance relates to the number of layers on my working dataset.
Theoretically, capturing a highly hierarchical structure with just one layer is suboptimal. So hopefully, by integrating multiple layers here, we let users test the model's power along another dimension on their own datasets.
Is there any update on this issue? Will it be merged? A very nice feature, I would say.
Just a quick question about how the initial states for the multi-layer LSTM are constructed: why do we need to initialize the hidden/cell states of the LSTM by passing the image representation through an FC layer? Is this the standard way of doing it?
That's how we pass the representation of the image learned by the CNN (encoder) to the RNN/LSTM (decoder). The decoder dimension is a hyperparameter and usually differs from the encoder dimension, so we have to project the encoder output, which is 2048-dimensional in this case, down to the decoder dimension. Otherwise the encoder output can't be fed into the decoder directly as its initial hidden/cell state.
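For illustration, here is a minimal sketch of that projection, assuming the encoder output is mean-pooled over spatial positions and mapped by one pair of linear layers per decoder layer (the class and parameter names are hypothetical):

```python
import torch
import torch.nn as nn

class InitState(nn.Module):
    """Project the (mean-pooled) encoder output to per-layer h0/c0."""

    def __init__(self, encoder_dim=2048, decoder_dim=512, num_layers=2):
        super().__init__()
        self.init_h = nn.ModuleList(
            [nn.Linear(encoder_dim, decoder_dim) for _ in range(num_layers)])
        self.init_c = nn.ModuleList(
            [nn.Linear(encoder_dim, decoder_dim) for _ in range(num_layers)])

    def forward(self, encoder_out):
        # encoder_out: (batch, num_pixels, encoder_dim)
        mean = encoder_out.mean(dim=1)            # (batch, encoder_dim)
        h = [fc(mean) for fc in self.init_h]      # one state per layer
        c = [fc(mean) for fc in self.init_c]
        return h, c
```

With an encoder output of shape `(batch, num_pixels, 2048)`, this yields `num_layers` tensors of shape `(batch, decoder_dim)` for each of `h` and `c`, which seed the stacked LSTM cells at the first time step.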