FENRlR
Based on your [fork](https://github.com/KevinWang676/VITS2-Chinese), I've found that you have the correct symbols for the cleaner. But I've also found that your [train](https://github.com/KevinWang676/VITS2-Chinese/blob/main/filelists/short_character_anno_train.list.cleaned) and [validation](https://github.com/KevinWang676/VITS2-Chinese/blob/main/filelists/short_character_anno_val.list.cleaned) filelists contain identical text. The datasets...
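A minimal sketch for checking whether two filelists share entries, assuming each filelist is a UTF-8 text file with one entry per line. The helper names `read_entries` and `filelist_overlap` are hypothetical and not part of either repo.

```python
from pathlib import Path


def read_entries(path):
    """Return the non-empty, stripped lines of a filelist as a set."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def filelist_overlap(train_path, val_path):
    """Return the entries that appear in both filelists."""
    return read_entries(train_path) & read_entries(val_path)


# Usage (paths are assumptions; point them at your own filelists):
# shared = filelist_overlap("filelists/train.list.cleaned",
#                           "filelists/val.list.cleaned")
# print(len(shared), "entries appear in both lists")
```

If the returned set is as large as the validation list itself, the validation data is probably a copy of the training data and the validation loss will not say much about generalization.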
Indeed yes. Actually, I forgot to update the notebook version of inference. You can comment out langdetector like [this](https://github.com/FENRlR/MB-iSTFT-VITS2/blob/main/inference.py#L65).
I really have no clue how to reproduce this error. [It seems, however, that someone has already encountered a similar situation ([PS2]).](https://webcache.googleusercontent.com/search?q=cache:snQX9nZM4EEJ:https://blog.csdn.net/weixin_44649780/article/details/133904548)
A huge thank you for sharing the results. The main reason for using iSTFT here was the fast synthesis speed it showed in its original VITS variant. As so,...
Currently, no. It seems there were some issues with 16kHz sampling rate in the original iSTFT repo. I've never seen the other two, however.
@Insensiblee Before reverting to that commit, have you tried [changing symbols?](https://github.com/FENRlR/MB-iSTFT-VITS2/blob/main/text/symbols.py#L82) The symbol list he used for Russian has exactly 155 entries, while 205 is the length of the...
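A minimal sketch of the kind of length check that catches this mismatch before loading a checkpoint. It assumes the expected count is the number of rows in the checkpoint's text embedding (in VITS-style models this is usually the first dimension of `enc_p.emb.weight`, which is an assumption here and should be verified against your checkpoint). `check_symbol_count` is a hypothetical helper, and the symbol list below is a placeholder, not the real one.

```python
def check_symbol_count(symbols, expected):
    """Return len(symbols), or raise ValueError if it differs from `expected`.

    `expected` is assumed to be the vocabulary size the checkpoint was
    trained with (for example, the row count of its text embedding).
    """
    actual = len(symbols)
    if actual != expected:
        raise ValueError(
            f"symbol table has {actual} entries, checkpoint expects {expected}"
        )
    return actual


# Placeholder symbols for illustration only (205 entries).
current_symbols = [f"sym{i}" for i in range(205)]

try:
    check_symbol_count(current_symbols, expected=155)
except ValueError as err:
    print(err)
```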
It would not be exactly the same as 50k at a batch size of 32. But since I haven't tried a batch size of 16, there's not much that I can tell.
Alongside p0p4k's suggestion, I found an issue reporting that durations got slower after reducing the sampling rate to 16000 Hz. https://github.com/MasayaKawamura/MB-iSTFT-VITS/issues/7#issuecomment-1664896030 Also, there are repos especially targeting the 44100...
@w11wo Thank you for the detailed explanation. But I still wonder whether it is okay to go with those settings from the original VITS, because `"upsample_rates": [4,4]`([[8,8] without `"subbands":4`](https://github.com/MasayaKawamura/MB-iSTFT-VITS/blob/main/configs/ljs_istft_vits.json)) and `"upsample_kernel_sizes":...
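A rough sketch of why the two `upsample_rates` settings can both be valid. It assumes the decoder's total upsampling factor is the product of `upsample_rates`, the iSTFT hop size, and the number of subbands, and that this product has to equal the spectrogram `hop_length`. The iSTFT hop size of 4 and `hop_length` of 256 used below are assumed values and should be checked against the actual config files. `total_upsampling` is a hypothetical helper.

```python
from math import prod


def total_upsampling(upsample_rates, istft_hop_size, subbands=1):
    """Samples produced per input frame under the stated assumption."""
    return prod(upsample_rates) * istft_hop_size * subbands


HOP_LENGTH = 256  # assumed spectrogram hop length

# Multi-band setting: [4, 4] with 4 subbands.
multi_band = total_upsampling([4, 4], istft_hop_size=4, subbands=4)

# Single-band setting: [8, 8] without subbands.
single_band = total_upsampling([8, 8], istft_hop_size=4)

print(multi_band == HOP_LENGTH, single_band == HOP_LENGTH)
```

Under these assumptions, changing `upsample_rates` without adjusting the other factors would make the generated waveform length disagree with the spectrogram frames.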
The result of using the stochastic duration predictor was an open question, as I've only tested without it so far. Thank you for the results!