Yifan Yang

Results 28 comments of Yifan Yang

Yeah, I have an RNNLM trained on GigaSpeech, but not in icefall style. https://huggingface.co/yfyeung/icefall-asr-gigaspeech-rnn_lm-2023-10-08

@AmirHussein96 I noticed that you modified `k2.rnnt_loss_pruned` in k2. Would you mind sharing your branch?

**1 1** 1 0 **1 1 1** 1 **1** 1 0 0 `limit_lens`, which is the maximum number of reduced frames for each utterance, is [2, 1, 3] instead of [1, 1,...

> For example, if `x_lens = [10, 12]`, `y_lens = [1, 2]`, then `T = 12`, and `limit_lens = T - y_lens = [11, 10]`, where `limit_lens[0] = 11`...
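The arithmetic in the quoted example can be sketched as follows (a minimal illustration; `compute_limit_lens` is a hypothetical helper, not an icefall or k2 function):

```python
import torch


def compute_limit_lens(x_lens: torch.Tensor, y_lens: torch.Tensor) -> torch.Tensor:
    """Upper bound on the number of reduced frames per utterance.

    T is the padded (maximum) frame count in the batch, so the
    limit for each utterance is T - y_lens, as in the example above.
    """
    T = int(x_lens.max())
    return T - y_lens


# Reproduce the quoted example: T = 12, limit_lens = [11, 10].
x_lens = torch.tensor([10, 12])
y_lens = torch.tensor([1, 2])
print(compute_limit_lens(x_lens, y_lens).tolist())  # [11, 10]
```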

> Hi, have you got any results with phone-based models? I previously tried with LibriSpeech and the result was worse than BPE. For pruned transducer I only got 4-5...

> Maybe sometime later. Not recently.

@drawfish Thanks for your suggestion. This model is trained on LibriSpeech, whose test sets do not contain entirely silent utterances. IMO, you should modify the model-export code.

> I'm moving the conversation here since the previous [PR](https://github.com/k2-fsa/icefall/pull/1500) was closed.
>
> I ran @yfyeung 's training command using the merged k2ssl codes, with the batch-size and world-size...

> ```shell
> --max-duration 300 \
> --accum-grad 4 \
> ```

The current gradient accumulation mechanism simulates a multi-GPU setup. You can simulate my setup using 4 GPUs with `acc_grad`...
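The accumulation mechanism mentioned above can be sketched as a generic PyTorch training loop (a hedged illustration, not the actual icefall code; `model`, `optimizer`, and `loader` are stand-in names):

```python
import torch

ACCUM_GRAD = 4  # micro-batches per optimizer step, as in --accum-grad 4

# Toy stand-ins for the real model, optimizer, and data loader.
model = torch.nn.Linear(8, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loader = [(torch.randn(3, 8), torch.randint(0, 2, (3,))) for _ in range(8)]

num_updates = 0
optimizer.zero_grad()
for step, (x, y) in enumerate(loader, start=1):
    loss = torch.nn.functional.cross_entropy(model(x), y)
    # Scale the loss so the accumulated gradient matches one large batch.
    (loss / ACCUM_GRAD).backward()
    if step % ACCUM_GRAD == 0:
        # One parameter update per ACCUM_GRAD micro-batches, which is
        # what makes 1 GPU with accum-grad 4 mimic a 4-GPU setup.
        optimizer.step()
        optimizer.zero_grad()
        num_updates += 1
```

With 8 micro-batches and `ACCUM_GRAD = 4`, the loop performs 2 parameter updates, each seeing the gradient of 4 micro-batches.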