huangx06 comments


Character modeling and pinyin modeling

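I also don't think character-level modeling can reach high accuracy. There are too many homophones. To pick the correct character, wouldn't the model need very strong context modeling ability?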
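To make the homophone point concrete, here is a minimal sketch. The tiny lexicon is a hand-made, hypothetical example (not taken from any dataset or from the repository under discussion); it only shows how several characters collapse onto one pinyin syllable, which is the ambiguity a character-level output has to resolve from context.

```python
from collections import defaultdict

# Hypothetical toy lexicon of (character, pinyin with tone number) pairs.
# Hand-picked for illustration only.
LEXICON = [
    ("是", "shi4"), ("市", "shi4"), ("事", "shi4"), ("试", "shi4"),
    ("他", "ta1"), ("她", "ta1"), ("它", "ta1"),
    ("我", "wo3"),
]

def group_homophones(lexicon):
    """Group characters by their pinyin syllable."""
    groups = defaultdict(list)
    for char, pinyin in lexicon:
        groups[pinyin].append(char)
    return dict(groups)

homophones = group_homophones(LEXICON)

# Each syllable with more than one character is a choice that a
# character-level model must make using surrounding context.
for pinyin, chars in homophones.items():
    print(pinyin, len(chars), "".join(chars))
```

In this toy lexicon a pinyin-level target has 3 classes while a character-level target has 8, and the extra 5 distinctions cannot be recovered from the sound alone.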

possibly training without transcripts

I think you can use ASR to convert your audio to text.

possibly training without transcripts

I don't know exactly what your problem is. The training data for a Tacotron model is symbol-audio pairs. You said you have audio without labeled text. So I suggest that...
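As a rough illustration of what "symbol-audio pairs" means in practice, the sketch below matches audio files to transcripts by filename and reports the files that have no text. The file names, the `build_pairs` helper, and the transcript dictionary are all hypothetical; this is not the data loader of any particular Tacotron implementation.

```python
from pathlib import Path

def build_pairs(audio_names, transcripts):
    """Pair each audio file with its transcript, keyed by filename stem.

    Returns (pairs, missing): pairs is a list of (audio_name, text),
    missing is a list of audio files that have no transcript.
    """
    pairs, missing = [], []
    for name in audio_names:
        text = transcripts.get(Path(name).stem)
        if text:
            pairs.append((name, text))
        else:
            missing.append(name)
    return pairs, missing

# Hypothetical inputs: three recordings, only two of them transcribed.
audio_names = ["utt001.wav", "utt002.wav", "utt003.wav"]
transcripts = {"utt001": "hello world", "utt003": "good morning"}

pairs, missing = build_pairs(audio_names, transcripts)
print(pairs)
print(missing)
```

Files that end up in `missing` are the ones that would need labels from somewhere, for example from an ASR system, before they could be used as training pairs.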

possibly training without transcripts

Yes. ASR stands for Automatic Speech Recognition, but I don't think gluing an ASR model and a TTS model together would be convenient.