xuankai@cmu
Hi @YoshikiMas, thanks for noticing the problem. Yeah, I agree with the suggested change in stage 5. For the `lm_train_text`, we can simply use `data/${train_set}/text`, the same as what is...
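For reference, a minimal sketch of how that could be passed on the command line (the `--lm_train_text` option follows the usual `asr.sh` convention; all other recipe options are omitted):

```bash
# Hedged sketch: reuse the ASR training transcripts as the LM training text.
# ${train_set} is whatever the recipe's run.sh defines.
./asr.sh \
    --lm_train_text "data/${train_set}/text" \
    "$@"
```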
Hi @zqwang7 , for (1), you are not supposed to pass the normalize flag, because it is the collect-stats stage. I assume error (2) happens because no stats were generated...
@zqwang7 [line 66](https://github.com/espnet/espnet/blob/2aa734d65013a0b33a6f8cb59b22159a87360eb8/egs2/chime4/asr1/conf/tuning/train_asr_transformer_wavlm_lr1e-3_specaug_accum1_preenc128_warmup20k.yaml#L66) uses `extract_feats_in_collect_stats`.
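For context, a hedged sketch of that option as it appears in a training config (the value below is illustrative; check line 66 of the linked file for the actual setting):

```yaml
# When false, frontend features are not extracted during the
# collect-stats stage, which changes which stats files get written there.
extract_feats_in_collect_stats: false
```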
Hi @minamo817 , the number of outputs is not checked during training, so you can successfully complete training (stage 6). But in stage 7, during inference, it is...
> We could do all of this within a single module, but I figured breaking it up would be easier to read/navigate. Do you have any recommendations? I got it....
@BriansIDP You need to follow stages 14 & 16 in `asr.sh`.
@BriansIDP Another suggestion: it may be better to update the Hugging Face repo name. If you look at the other pretrained checkpoints, you will notice the names follow the pattern: contributor_dataset_expname.
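A hedged sketch of what those stages could look like together (the `--skip_upload_hf` and `--hf_repo` option names are my assumption from the template recipe; check the header of `asr.sh` for the exact spelling):

```bash
# Pack the trained model (stage 14) and upload it to Hugging Face (stage 16).
./asr.sh \
    --stage 14 --stop_stage 16 \
    --skip_upload_hf false \
    --hf_repo "espnet/contributor_dataset_expname"  # hypothetical name following the pattern above
```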
Hi @cyaaronk , thanks for the updates. I'm thinking that if it is only for ASR models, maybe you can put it in `egs2/TEMPLATE/asr1/pyscripts/utils`. Then you may modify `asr.sh`...
@Yuanyuan-888 The quick fix is to try an earlier version of Whisper, `20230308`, since Whisper changed its tokenizer API.
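A minimal sketch, assuming Whisper was installed from PyPI as `openai-whisper`:

```bash
# Pin the earlier release whose tokenizer API still matches.
pip install openai-whisper==20230308
```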