Zengwei Yao
This PR implements an LSTM model. See https://github.com/k2-fsa/icefall/pull/479 for details.
Thanks for sharing your code. I'm wondering whether `enc_dim` is 64 (as in your paper) or 256 (as in the code at https://github.com/ujscjj/DPTNet/blob/master/dpt_net.py).
This PR updates the DoubleSwish function, replacing `x * sigmoid(x - 1)` with `x * sigmoid(x - 1) - 0.05x`. Experimental results from Dan's Zipformer training on `train-clean-100` show that...
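For reference, the change is only to the activation formula itself. A minimal pure-Python sketch of the old and new variants (function names here are illustrative; the actual icefall implementation is a PyTorch module with a custom backward):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def double_swish(x):
    # Original DoubleSwish: x * sigmoid(x - 1)
    return x * sigmoid(x - 1.0)

def double_swish_new(x):
    # Updated variant from this PR: subtract 0.05 * x
    return x * sigmoid(x - 1.0) - 0.05 * x
```

The extra `-0.05x` term shifts the function slightly downward and makes its slope at large negative inputs negative rather than zero.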
This PR adds an LSTM recipe for the AISHELL dataset. Refer to https://github.com/k2-fsa/icefall/pull/479 and https://github.com/k2-fsa/icefall/pull/564 for details.
This PR applies a latency penalty to the streaming ScaledLSTM model to decrease the symbol delay. A gradient filter is applied inside the LSTM module to prevent training instability. Related PR:...
This PR adds the gradient filter to the `tdnn_lstm_ctc` recipe. See https://github.com/k2-fsa/icefall/pull/564 for details.
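The gradient filter itself lives in the icefall code linked above; as a rough intuition, it suppresses outlier gradients flowing through the LSTM rather than merely rescaling them. A minimal pure-Python sketch of that idea (the function name, the reference-norm scheme, and the threshold are all illustrative assumptions, not the actual implementation):

```python
def filter_gradient(grad, ref_norms, threshold=10.0):
    """Zero out a gradient whose norm is an outlier.

    grad: one parameter's gradient, flattened to a list of floats.
    ref_norms: recent gradient norms, used as a reference scale.
    threshold: how many times the reference scale counts as an outlier
               (illustrative value, not from the PR).
    """
    norm = sum(g * g for g in grad) ** 0.5
    ref = sum(ref_norms) / len(ref_norms)
    if norm > threshold * ref:
        # Outlier: drop this gradient entirely instead of letting it
        # destabilize the LSTM weights.
        return [0.0] * len(grad)
    return grad
```

In the real recipe this logic sits in the backward pass (e.g. via a custom autograd function), so the forward computation is unchanged.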
This PR clips the RNN gradients in a chunk-wise manner to address the gradient-explosion problem in the backward pass. When computing each chunk, we clip the gradients...
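The idea of clipping per chunk, rather than once over the whole unrolled sequence, can be sketched in pure Python as follows (the helper name and the norm-based clipping rule are assumptions for illustration; the actual PR operates on PyTorch tensors inside the backward pass):

```python
def clip_chunk_grads(grads, max_norm):
    """Clip one chunk's gradient to max_norm before it is propagated
    back into the previous chunk, so an exploding gradient in a late
    chunk cannot blow up the whole backward pass.

    grads: the chunk's gradient, flattened to a list of floats.
    """
    norm = sum(g * g for g in grads) ** 0.5
    if norm > max_norm:
        scale = max_norm / norm
        return [g * scale for g in grads]
    return grads
```

Applying this at each chunk boundary bounds the gradient norm entering every earlier chunk, which is the point of doing the clipping chunk-wise instead of globally.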
This PR uses CTC as an auxiliary loss for the streaming LSTM transducer model (see https://github.com/k2-fsa/icefall/pull/479), following https://github.com/k2-fsa/icefall/pull/477.
This PR adds a CTC/AED system to the `zipformer` recipe. * CTC/AED results on LibriSpeech, trained for 50 epochs (`--ctc-loss-scale=0.1`, `--attention-decoder-loss-scale=0.9`); decoding method: sample 100-best paths from the CTC lattice and rescore with...