Jean Du
@troyas `fairseq-train: error: unrecognized arguments: --latency-weight-var 0.1` This type of error is easy to debug from the error log: after comparing the main-branch code with an older...
I met the same problem when saving a HuBERT model. To reproduce, one can use the open-source hubert_base model and run:
```python
import fairseq
import torch

model, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task(['hubert_base_ls960.pt'])
torch.save(model[0], 'hubert_model.pt')
```
Error message...
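Since `torch.save(model[0], ...)` pickles the whole fairseq module, a common workaround is to save only the weights. A minimal sketch, assuming the failure is a pickling error on the full module (the filenames are just the ones from the snippet above):

```python
import fairseq
import torch

# Save just the state_dict instead of pickling the whole fairseq module.
models, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task(['hubert_base_ls960.pt'])
torch.save(models[0].state_dict(), 'hubert_state_dict.pt')

# Restore by rebuilding the model the same way and loading the weights back in.
models, _, _ = fairseq.checkpoint_utils.load_model_ensemble_and_task(['hubert_base_ls960.pt'])
models[0].load_state_dict(torch.load('hubert_state_dict.pt'))
```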
> > Sure, but I only have Librispeech-100 configs at hand right now.
> > Training conf:
> > ```yaml
> > # general
> > batch_type: numel
> > batch_bins: 2000000
> > ...
I ran into the same problem. My arpa was built with SRILM; Chinese is modeled at the character level, but English is not modeled at the letter level, it uses BPE (with ▁ for spaces). After adding the language model, performance drops very noticeably, and not only English is affected, Chinese also has big problems. I had seen the issue https://github.com/Slyne/ctc_decoder/issues/9 before, but I did not get a clear solution from it. What I am considering now is to solve this directly when building the arpa: use KenLM and split the English letters of the corpus with spaces as well. But this introduces a new problem: what happens to the original spaces inside the English text? One way I can think of is to first convert the spaces in the corpus to another character, e.g. "|", build the arpa, replace spaces in the recognition results with "|" as well, rescore with the arpa, and finally convert "|" back to spaces. This method is rather cumbersome. @Slyne do you have a better idea?
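To make the space-to-"|" workaround concrete, here is a minimal sketch of the two text mappings it needs (hypothetical helpers, not part of wenet or KenLM; it assumes character-level units on both the Chinese and English side):

```python
def to_lm_units(text: str) -> str:
    """Map raw text to LM units: split everything into single characters
    and stand in '|' for the real spaces, e.g.
    'play 周杰伦' -> 'p l a y | 周 杰 伦'."""
    return ' '.join('|' if ch == ' ' else ch for ch in text)

def from_lm_units(units: str) -> str:
    """Invert to_lm_units after arpa rescoring."""
    return ''.join(units.split(' ')).replace('|', ' ')

# Round-trip check:
assert from_lm_units(to_lm_units('play 周杰伦')) == 'play 周杰伦'
```

The corpus would go through `to_lm_units` before `lmplz`, the hypotheses through the same mapping before rescoring, and the final output through `from_lm_units`.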
I met the same problem too. On my 36k-hour Chinese + English + Mandarin-English code-switch data, I use a u2++ conformer with this config:
```
output_size: 256    # dimension of attention
...
```
> I met the same problem too. On my 36k-hour Chinese + English + Mandarin-English code-switch data, I use a u2++ conformer with this config:
>
> ```
> output_size: 256
> ...
I found out the reason in my case: I had changed the default SpecAug conf to:
```
spec_aug_conf:
    num_t_mask: 2
    num_f_mask: 2
    max_t: 30
    max_f: 20
```
Because I use this config...
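As a quick sanity check on how aggressive this masking is, a back-of-the-envelope sketch (assuming 10 ms frames; the numbers come from the config above, not from the wenet defaults):

```python
num_t_mask, max_t = 2, 30        # time masks from the config above
utt_frames = 300                 # a hypothetical 3-second utterance at 10 ms/frame
worst_case = num_t_mask * max_t / utt_frames
print(f'up to {worst_case:.0%} of frames can be masked')   # up to 20%
```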
Hi, here is some additional information. When I use the smaller model config:
```
output_size: 256      # dimension of attention
attention_heads: 4
linear_units: 2048    # the number of units of...
```
Hi, I meet the same problem when using the wenet_api Python binding. If I set continuous_decoding to True, the segment result produced before the endpoint is cleared by `decoder_->ResetContinuousDecoding();` at https://github.com/wenet-e2e/wenet/blob/main/runtime/core/api/wenet_api.cc#L143...
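A client-side workaround is to accumulate each finalized segment yourself before the reset wipes it inside the decoder. A minimal sketch, assuming the `wenetruntime` binding's `Decoder(lang=...)` constructor and `decode(chunk, last)` call as shown in its examples, and a JSON result with a `type` field (treat these names and the exact result layout as assumptions):

```python
import json
import wave
import wenetruntime as wenet

decoder = wenet.Decoder(lang='chs')   # constructor as in the wenetruntime examples (assumption)
transcript = []                       # our own copy of the finalized segments

with wave.open('test.wav', 'rb') as fin:   # 'test.wav' is a placeholder, 16 kHz 16-bit mono
    pcm = fin.readframes(fin.getnframes())

step = int(0.5 * 16000) * 2                # 0.5 s chunks of 16-bit samples
for i in range(0, len(pcm), step):
    last = i + step >= len(pcm)
    result = json.loads(decoder.decode(pcm[i:i + step], last))
    if result.get('type') == 'final_result':
        # Keep the segment ourselves before the next endpoint reset clears it.
        transcript.append(result.get('nbest'))

print(transcript)
```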
> When I use make_shard_list.py to make shards for GigaSpeech, there is no error output, but some of the *.tar files it generates are not readable.

@agreatbush I meet the...
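To locate the bad shards, a small standalone check with Python's standard library can try to read every tar and report the ones that fail (the `shards/` path is a placeholder):

```python
import glob
import tarfile

for path in sorted(glob.glob('shards/*.tar')):
    try:
        with tarfile.open(path) as tar:
            tar.getmembers()   # forces a full pass over the archive index
    except (tarfile.TarError, EOFError) as err:
        print(f'unreadable shard: {path}: {err}')
```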