tuntun990606

Results 16 comments of tuntun990606

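> I ran into a similar problem, but I have only just started training. The syllables MFA generates are no longer pinyin; they are symbols I can't read.

You need to add the phonemes to text/symbols.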
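One way to find out which phonemes still need to be added is to compare the phonemes that appear in the preprocessed training file against the symbols the model already knows. The following is only a rough sketch: it assumes each line of train.txt has the form `id|speaker|{phonemes}|text`, and `collect_phonemes`, `missing_symbols`, the shortened sample line, and the `known` list are hypothetical names and data for illustration, not part of the repository.

```python
from typing import Iterable, List, Set


def collect_phonemes(lines: Iterable[str]) -> Set[str]:
    """Gather every phoneme that appears in the third |-separated field."""
    found: Set[str] = set()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        # The phoneme field looks like "{m ej˧˥ tʰ ...}"; drop the braces.
        found.update(fields[2].strip("{}").split())
    return found


def missing_symbols(lines: Iterable[str], known_symbols: Iterable[str]) -> List[str]:
    """Return the phonemes seen in the data that are not in known_symbols."""
    return sorted(collect_phonemes(lines) - set(known_symbols))


# Shortened, illustrative line (assumption, not a full dataset entry).
sample_lines = ["SSB01120099|SSB0112|{m ej˧˥ tʰ i˨˩˦}|媒 体"]
known = ["m", "tʰ"]  # hypothetical stand-in for the existing symbol list
print(missing_symbols(sample_lines, known))
```

Anything this reports would need to be added to the symbol list before training, otherwise those phonemes cannot be mapped to IDs.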

> In my case, working with the AISHELL3 dataset, I encountered the same problem, and I found that the MFA output TextGrid has 2 sections, one is "phones" which...

> When I synthesize Chinese sentences, the phoneme sequence is always {sp} and the resulting audio cannot be played. ![1669127829277](https://user-images.githubusercontent.com/118829555/203341584-d2a66570-9622-4c6d-b215-ec95c5c72bd8.png) ![1669127868553](https://user-images.githubusercontent.com/118829555/203341748-d3f73a65-93ea-4211-b10e-76651c23b1ff.png)

Try inputting jin1 tian1 ni3 chi1 le4...

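The original author's setup appears to use pinyin as the raw sentence, with each syllable split into its consonant and vowel parts as phonemes. After preprocessing, train.txt looks like this: SSB05440297|SSB0544|{w uei4 j ia1 y i5}|wei4 jia1 yi5 However, if you align the pinyin directly using the official MFA procedure, the pinyin dictionary that MFA provides is incorrect, so every phone in the TextGrid comes out as spn. Aligning from Chinese characters to phonemes instead avoids this error (do not use the pinyin.dict provided by MFA; use one of the other dictionaries MFA provides). train.txt then looks like this: SSB01120099|SSB0112|{m ej˧˥ tʰ i˨˩˦ ʂ u˥˥ tɕ i˧˥ j ow˨˩˦ ʂ ə˧˥ n m ə˩}|媒 体 书 籍...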
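To make the train.txt layout concrete, here is a small sketch that splits one line into its four `|`-separated fields. The field meanings (utterance ID, speaker, phoneme sequence in braces, raw text) are inferred from the example lines in this thread, and `TrainEntry` and `parse_train_line` are hypothetical names, not functions from the repository.

```python
from typing import List, NamedTuple


class TrainEntry(NamedTuple):
    utt_id: str          # e.g. SSB05440297
    speaker: str         # e.g. SSB0544
    phonemes: List[str]  # tokens from inside the braces
    raw_text: str        # pinyin or Chinese characters, depending on the setup


def parse_train_line(line: str) -> TrainEntry:
    """Split one train.txt line of the form id|speaker|{phonemes}|text."""
    utt_id, speaker, phones, raw_text = line.rstrip("\n").split("|", 3)
    phonemes = phones.strip().strip("{}").split()
    return TrainEntry(utt_id, speaker, phonemes, raw_text)


# The pinyin-style line quoted in this thread.
pinyin_entry = parse_train_line(
    "SSB05440297|SSB0544|{w uei4 j ia1 y i5}|wei4 jia1 yi5"
)
print(pinyin_entry.phonemes)
print(pinyin_entry.raw_text)
```

The same function should handle the character-based variant as well, since only the content of the third and fourth fields changes.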

"ai˥˥","ai˥˩","ai˦","ai˧˥","ai˨","ai˨˩˦","ai˩","au˥˥","au˥˩","au˦","au˧˥","au˨","au˨˩˦","au˩","a˥˥","a˥˩","a˦","a˧˥","a˨","a˨˩˦","a˩","ei˥˥","ei˥˩","ei˦","ei˧˥","ei˨","ei˨˩˦","ei˩","e˥˥","e˥˩","e˦","e˧˥","e˨","e˨˩˦","e˩","f","i˥˥","i˥˩","i˦","i˧˥","i˨","i˨˩˦","i˩","j","k","kʰ","l","m","n","ou˥˥","ou˥˩","ou˦","ou˧˥","ou˨","ou˨˩˦","ou˩","o˥˥","o˥˩","o˦","o˧˥","o˨","o˨˩˦","o˩","p","pʰ","s","t","ts","tsʰ","tɕ","tɕʰ","tʰ","u˥˥","u˥˩","u˦","u˧˥","u˨","u˨˩˦","u˩","w","x","y˥˥","y˥˩","y˦","y˧˥","y˨","y˨˩˦","y˩","z̩˥˥","z̩˥˩","z̩˦","z̩˧˥","z̩˨","z̩˨˩˦","z̩˩","ŋ","ŋ̍˧˥","ɕ","ə˥˥","ə˥˩","ə˦","ə˧˥","ə˨","ə˨˩˦","ə˩","ɥ","ɻ","ʂ","ʈʂ","ʈʂʰ","ʐ","ʐ̩˥˥","ʐ̩˥˩","ʐ̩˦","ʐ̩˧˥","ʐ̩˨","ʐ̩˨˩˦","ʐ̩˩","ʔ","ow˥˩", "aj˨˩˦", "aw˥˥", "ej˥˩", "aw˧˥","aj˥˩", "ej˧˥", "ow˨˩˦","aw˨˩˦","ow˧˥", "ej˨˩˦", "aw˥˩", "aj˧˥", "aj˥˥","ow˥˥","ej˥˥","ow˥˥","ow˨" These are all the phonemes.

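> I followed the case 3 workflow in https://montreal-forced-aligner.readthedocs.io/en/latest/first_steps/index.html#first-steps-align-pretrained. At first I used the AISHELL-3 dataset to retrain a pinyin-based phoneme dictionary and acoustic model. But with the same alignment model, the result after training on a Chinese dataset I recorded myself is unclear and not as good as with AISHELL-3. Could it be that the pinyin alignment model does not align as well as the other officially provided models?

I haven't used a self-trained MFA model, but I've heard from others that training your own is somewhat slower?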

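> > The code from the original author generates lab files in pinyin, but if you align the pinyin directly using the official MFA procedure, the pinyin dictionary that MFA provides is incorrect, so every phone in the TextGrid comes out as spn. In the code that generates the lab files, /preprocessor/preprocessor.py, change text = text.split(" ")[1::2] to text = text.split(" ")[0::2]. The resulting lab files then contain Chinese characters, and you can align them with the dictionary and acoustic model that MFA provides.
>
> I followed your method on both the AISHELL3 dataset and a dataset I recorded myself. Synthesis from AISHELL3 sounds better, while synthesis from my own dataset has a lot of noise. What could be causing this? My own dataset was recorded in a quiet environment and does not contain much noise.

Sorry, I haven't tried any other dataset. Could it be a sampling-rate issue? (If you solve it later, please reply and let me know; I'd really like to know.)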
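The effect of the `[1::2]` to `[0::2]` change is easiest to see on a single transcript line. The sketch below assumes the transcript alternates Chinese characters and pinyin syllables separated by single spaces; the sample string and the helper `select_tokens` are hypothetical and only illustrate the slicing, they are not the preprocessor's actual code.

```python
from typing import List


def select_tokens(text: str, use_characters: bool) -> List[str]:
    """Pick every other token from an alternating 'char pinyin char pinyin' line."""
    tokens = text.split(" ")
    # [0::2] keeps tokens at even positions (the characters),
    # [1::2] keeps tokens at odd positions (the pinyin syllables).
    return tokens[0::2] if use_characters else tokens[1::2]


# Hypothetical transcript text with characters and pinyin interleaved.
sample_text = "媒 mei2 体 ti3"

pinyin_tokens = select_tokens(sample_text, use_characters=False)
hanzi_tokens = select_tokens(sample_text, use_characters=True)
print(" ".join(pinyin_tokens))
print(" ".join(hanzi_tokens))
```

With the characters selected, the lab files hold Chinese text, which is what a character-keyed MFA dictionary expects.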

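> > The original author's setup appears to use pinyin as the raw sentence, with each syllable split into its consonant and vowel parts as phonemes. After preprocessing, train.txt looks like this: SSB05440297|SSB0544|{w uei4 j ia1 y i5}|wei4 jia1 yi5 However, if you align the pinyin directly using the official MFA procedure, the pinyin dictionary that MFA provides is incorrect, so every phone in the TextGrid comes out as spn. Aligning from Chinese characters to phonemes instead avoids this error (do not use the pinyin.dict provided by MFA; use one of the other dictionaries MFA provides). train.txt then looks like this: SSB01120099|SSB0112|{m ej˧˥ tʰ i˨˩˦ ʂ u˥˥ tɕ i˧˥ j ow˨˩˦ ʂ ə˧˥ n m ə˩}|媒 体...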
