cmmclee
Same as #12.
> > > @a312863063 Hi, how do you merge the original logits into a low-dimension vector? > > > > > > Hi, you can see how it maps the...
Did you figure out whether ASR accuracy affects lip synthesis? I have tried several Chinese ASR models, such as 'jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn' and 'TencentGameMate/chinese-wav2vec2-large', but the results did not improve significantly. What...
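> Here is an idea for everyone: the features extracted by an ASR model are probabilities over characters, not over speech sounds. Chinese has a large number of characters and ASR models often misrecognize them, so the extracted features are weak. Converting the characters recognized by the ASR model into pinyin, which is more closely tied to the speech, or even further into initials and finals, yields more effective features for Chinese. My implementation: [code](https://github.com/flyingshan/chinese_speech_feature_extraction). In my experiments this improved on the original results; I hope it helps.

I tried the method you provided, but the results did not improve. In addition, this method introduces new errors for polyphonic characters (characters with more than one pronunciation). How did your experiments turn out? Is there anything I have misunderstood?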
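
For anyone following along, here is a minimal sketch of the character → pinyin → initial/final mapping discussed above. It is not the code from the linked repository. `TOY_LEXICON`, `split_syllable`, and `chars_to_units` are hypothetical names made up for illustration, the lexicon holds only a few characters with one toneless reading each, and `y`/`w` are treated as initials for simplicity. A real pipeline would need a full pronunciation dictionary.

```python
# Mandarin pinyin initials. Two-letter initials come first so that
# "zh", "ch", "sh" are matched before "z", "c", "s".
# Assumption: "y" and "w" are treated as initials here for simplicity.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

# Hypothetical toy lexicon: exactly one toneless reading per character.
# "行" can be read "xing" or "hang"; storing a single reading is where
# the polyphonic-character error mentioned above comes from.
TOY_LEXICON = {"你": "ni", "好": "hao", "中": "zhong", "文": "wen", "行": "xing"}


def split_syllable(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    # Syllables such as "an" or "er" have no initial.
    return "", syllable


def chars_to_units(text):
    """Map recognized characters to (initial, final) pairs.

    Characters missing from the toy lexicon are skipped.
    """
    units = []
    for ch in text:
        reading = TOY_LEXICON.get(ch)
        if reading is None:
            continue
        units.append(split_syllable(reading))
    return units


if __name__ == "__main__":
    print(chars_to_units("你好中文"))
```

The point of the mapping is that many different characters collapse onto a much smaller set of initial/final units, so a character-level ASR mistake that keeps the pronunciation intact no longer changes the feature. Context-free lookup like this cannot disambiguate polyphonic characters, which matches the concern raised in the reply.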
@flyingshan Could you provide a demo video?