Shi Xian comments

Results: 30 comments of Shi Xian

Building intermediate models with a predefined vocabulary leads to "poison" error

@kpu I ran into a similar problem when using a 105G corpus with entirely default settings. The command I used: lmplz --prune 0 5 30 -o 3 <...

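The timestamp model returns text and timestamp with inconsistent lengths

If the problem still occurs, please file this issue again. The SDK updates related to timestamps and post-processing optimization may have already resolved it.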

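When a segment contains a long silence, the timestamps of the recognition result are inaccurate

The problem has been located: when the punctuation is inaccurate, splitting sentences by punctuation produces clauses that include the silence in the middle. A fix is in progress.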

AESRC dataset

Hi, I just confirmed access to the dataset with the data company DataTang. You can get the data by submitting the form at https://www.datatang.com/Opendatasets. The Chinese name of the...

AESRC dataset

You can share the result of your application here :)

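With the latest version, the error NameError: name 'ClusterBackend' is not defined occurs

The latest code should no longer have this problem. If it still occurs, consider reinstalling your Python environment, since it may be caused by a dependency error :)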

With the latest version, the error NameError: name 'ClusterBackend' is not defined occurs

You can reinstall your Python environment and install everything in one step using the latest requirements.

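A list out of range error occurs partway through inference with model.generate

> When I recognize an audio clip of the sentence "答案是1.5" ("the answer is 1.5"), this exception is always triggered. The model has in fact recognized the audio correctly, but the post-processing does not seem to be strictly aligned with the time stamp and related content, so it is probably related to the post-processing of 1.5.
Could you share this audio clip? I will fix it.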

A list out of range error occurs partway through inference with model.generate

I see it. I will fix the problem in that post-processing function; closing this for now.

KeyError: 'asr-inference is not in the pipelines registry group auto-speech-recognition. Please make sure the correct version of ModelScope library is used.'

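Could you paste the detailed TypeError information you get when running inference through funasr? The funasr inference path is the one mainly maintained now.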