Alan Fang comments

Results: 49 comments of Alan Fang

Chinese open-source large speech model plan

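> On the data side, we could lean toward domestic dialects, minority languages, and low-resource languages. For dialects, I have a rough idea: first train a language identification model on the data from this website, then use that model to filter dialect data for the relevant languages. Feasibility still needs to be verified. https://zhongguoyuyan.cn/index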
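A minimal sketch of the filtering step in that idea, under stated assumptions: `predict_language` stands in for a language identification model trained on the site's data, and the dialect labels, file names, and the 0.8 threshold are all made up for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical predictor type: maps an utterance path to (label, confidence).
Predictor = Callable[[str], Tuple[str, float]]


def filter_by_dialect(utterances: Iterable[str],
                      predict_language: Predictor,
                      target: str,
                      min_confidence: float = 0.8) -> List[str]:
    """Keep utterances the predictor assigns to `target` with enough confidence."""
    kept = []
    for utt in utterances:
        label, confidence = predict_language(utt)
        if label == target and confidence >= min_confidence:
            kept.append(utt)
    return kept


def toy_predictor(utt: str) -> Tuple[str, float]:
    # Stand-in for a trained model: looks up made-up scores by file name.
    scores = {
        "a.wav": ("cantonese", 0.95),
        "b.wav": ("cantonese", 0.55),
        "c.wav": ("wu", 0.90),
    }
    return scores.get(utt, ("unknown", 0.0))


selected = filter_by_dialect(["a.wav", "b.wav", "c.wav"], toy_predictor, "cantonese")
print(selected)
```

The confidence threshold is the main knob: a lower value keeps more data at the cost of more mislabeled utterances, which is part of what would need verifying.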

Chinese open-source large speech model plan

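> > Would you be interested in contributing a PR (porting wespeaker's noise augmentation over)?
>
> Sure. I haven't submitted a PR before, so I'll learn how to do it.

From what I saw, wespeaker has two kinds of augmentation: reverberation and background noise. In my experiments, padding short noise at the head and tail of the utterance was also quite effective, with a noise token given during training.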
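A rough sketch of head/tail noise padding, assuming waveforms are plain lists of samples. The token name `<noise>` and its placement around the transcript are assumptions made for illustration, not wenet's or wespeaker's actual behavior.

```python
import random
from typing import List, Tuple

NOISE_TOKEN = "<noise>"  # hypothetical token name, not taken from wenet


def pad_with_noise(speech: List[float],
                   noise: List[float],
                   transcript: str,
                   head_len: int,
                   tail_len: int,
                   rng: random.Random) -> Tuple[List[float], str]:
    """Prepend and append short noise segments and mark them in the transcript."""

    def crop(length: int) -> List[float]:
        # Take a random window of `length` samples from the noise clip.
        if length <= 0:
            return []
        if length > len(noise):
            raise ValueError("noise clip is shorter than requested padding")
        start = rng.randint(0, len(noise) - length)
        return noise[start:start + length]

    head = crop(head_len)
    tail = crop(tail_len)
    tokens = []
    if head:
        tokens.append(NOISE_TOKEN)
    tokens.append(transcript)
    if tail:
        tokens.append(NOISE_TOKEN)
    return head + speech + tail, " ".join(tokens)


rng = random.Random(0)
speech = [0.1, 0.2, 0.3]
noise = [0.01 * i for i in range(10)]
audio, text = pad_with_noise(speech, noise, "ni hao", head_len=2, tail_len=3, rng=rng)
print(len(audio), text)
```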

LoRA support

> thx! any numbers?

I conducted experiments on my own data and will soon update the experiment results on the open-source data.

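> This could be done through inheritance, e.g. lora_conformer_encoder and lora_attention, overriding the encoder and attention, with a directory layout like this:
>
> * wenet/fintune/lora/encoder.py
> * wenet/fintune/lora/attention.py
>
> Then initialize them in init_model.py. This is almost non-invasive to the original code, and since fine-tuning also has LoRA variants, adapters, and other methods, it makes later extension easy.

ok, it's a good idea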
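The inheritance approach described in the quote might look roughly like the toy sketch below. It uses plain Python lists instead of a deep learning framework, and every class name and config key (`Attention`, `LoRAAttention`, `use_lora`, `lora_rank`, `lora_alpha`) is hypothetical rather than wenet's actual API. The "attention" here is reduced to a single projection so the pattern stays visible.

```python
from typing import List

Matrix = List[List[float]]


def matvec(w: Matrix, x: List[float]) -> List[float]:
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class Linear:
    """Toy dense layer y = W x, standing in for a framework linear layer."""

    def __init__(self, weight: Matrix):
        self.weight = weight

    def forward(self, x: List[float]) -> List[float]:
        return matvec(self.weight, x)


class LoRALinear(Linear):
    """Adds a low-rank update: y = W x + scale * B (A x); W itself is untouched."""

    def __init__(self, weight: Matrix, rank: int, alpha: float = 1.0):
        super().__init__(weight)
        out_dim, in_dim = len(weight), len(weight[0])
        # A is normally randomly initialised; a constant keeps this sketch deterministic.
        self.lora_a = [[0.01] * in_dim for _ in range(rank)]
        # B starts at zero so the layer initially matches the base layer.
        self.lora_b = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def forward(self, x: List[float]) -> List[float]:
        base = super().forward(x)
        delta = matvec(self.lora_b, matvec(self.lora_a, x))
        return [b + self.scale * d for b, d in zip(base, delta)]


class Attention:
    """Hypothetical base module; subclasses only override `make_linear`."""

    def __init__(self, weight: Matrix):
        self.proj = self.make_linear(weight)

    def make_linear(self, weight: Matrix) -> Linear:
        return Linear(weight)

    def forward(self, x: List[float]) -> List[float]:
        return self.proj.forward(x)


class LoRAAttention(Attention):
    def __init__(self, weight: Matrix, rank: int = 2, alpha: float = 4.0):
        self.rank, self.alpha = rank, alpha
        super().__init__(weight)

    def make_linear(self, weight: Matrix) -> Linear:
        return LoRALinear(weight, self.rank, self.alpha)


def init_model(config: dict, weight: Matrix) -> Attention:
    # The variant is chosen in one place, so the base classes need no changes.
    if config.get("use_lora"):
        return LoRAAttention(weight, config.get("lora_rank", 2),
                             config.get("lora_alpha", 4.0))
    return Attention(weight)


w = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 2.0]
base_out = init_model({}, w).forward(x)
lora_out = init_model({"use_lora": True}, w).forward(x)
print(base_out, lora_out)
```

Because the base classes stay unchanged, another fine-tuning variant such as an adapter could be added as one more subclass plus one more branch in the selection function.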

LoRA support

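> hi @fclearner, you mentioned earlier that LoRA still needs a patch. When will it be merged?

It's really just that the encoder arguments are passed incorrectly, because of an earlier commit. You can fix it yourself; I haven't had time to verify the code yet. I'll take a look over the weekend.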

LoRA support

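> hi @fclearner, you mentioned earlier that LoRA still needs a patch. When will it be merged?

Hello, I've roughly revised a version, but training has some problems on my side. I'll look at it tonight: https://github.com/fclearner/wenet/tree/LoRA_fix_args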

LoRA support

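> hi @fclearner, you mentioned earlier that LoRA still needs a patch. When will it be merged?

Calling model.eval() raises an error. It runs without deepspeed, so it looks like deepspeed has a problem at initialization, probably similar to this issue: https://github.com/huggingface/alignment-handbook/issues/57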

CTC decoding results contain the � character after training Whisper-large-v3

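Why does CTC produce garbled characters? Is it because whisper models text at the byte level and, because of CTC's conditional independence assumption, no context information is learned?

> This is expected behavior; the solution is #2247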
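One way to see where � can come from, assuming the tokenizer works on UTF-8 bytes as the question suggests: if a decoded hypothesis is missing part of a multi-byte character, the remaining bytes cannot be decoded and are shown as U+FFFD. The sketch below treats each byte as one token, which is a simplification, and it does not reproduce wenet's decoder or what #2247 changes.

```python
# Each byte of the UTF-8 encoding is treated as one token here; a real
# byte-level vocabulary is larger, but tokens can still end mid-character.
text = "你好"
byte_tokens = list(text.encode("utf-8"))  # 6 byte values, 3 per character


def decode(tokens):
    # errors="replace" turns invalid or incomplete byte sequences into U+FFFD.
    return bytes(tokens).decode("utf-8", errors="replace")


complete = decode(byte_tokens)
# Simulate a hypothesis that dropped one byte token in the middle of "你".
missing_one = decode(byte_tokens[:1] + byte_tokens[2:])

print(complete)
print(missing_one)
```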

Question about MoE support for multilingual recognition

> I just read this article. Its mixture-of-experts replaces the FFN in each conformer block. Because each FFN has two linear layers, if the first linear layer is...

Question about MoE support for multilingual recognition

> @Mddct, I can add a class if you feel it's necessary. @fclearner, this Google paper did not explore this area (parameter freezing or fine-tuning). However, the latest Mistral AI release...