geneing

Results: 21 comments by geneing

For #1: added direct distillation in f9357aafcb8a962a4ae7185e6e583ce5f6a42fc0. The results are pretty poor. The reason seems quite clear: the sound is highly oscillatory, but the overall phase doesn't affect the perceived sound....
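The phase point can be shown with a toy example. This is only a sketch and not the code from the commit: it assumes a sample-wise squared-error comparison (the comment does not say which loss was used), and the names `tone`, `mse`, and `magnitude_at` are hypothetical helpers. Two tones that differ only in phase get a large sample-wise error even though their magnitude at the tone frequency is the same.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate, chosen only for the illustration


def tone(freq_hz, phase, n=1600, sr=SAMPLE_RATE):
    """Return n samples of a unit-amplitude sine with the given phase offset."""
    return [math.sin(2 * math.pi * freq_hz * i / sr + phase) for i in range(n)]


def mse(a, b):
    """Mean squared error between two equal-length sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def magnitude_at(signal, freq_hz, sr=SAMPLE_RATE):
    """Magnitude of a single DFT-style projection onto freq_hz (phase-insensitive)."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sr) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sr) for i, x in enumerate(signal))
    return math.hypot(re, im) / len(signal)


# "Teacher" and "student" produce the same 440 Hz tone, half a cycle apart.
teacher = tone(440.0, 0.0)
student = tone(440.0, math.pi)

# Sample-wise comparison: heavily penalizes the phase offset.
sample_loss = mse(teacher, student)

# Magnitude comparison: ignores the phase offset.
mag_gap = abs(magnitude_at(teacher, 440.0) - magnitude_at(student, 440.0))

print(f"sample-wise MSE: {sample_loss:.3f}")
print(f"magnitude gap:   {mag_gap:.2e}")
```

The sample-wise error here is close to its maximum for unit-amplitude tones, while the magnitude gap is essentially zero. That is consistent with the observation that matching waveforms sample by sample penalizes differences a listener would not hear.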

@fatchord > Another thing I should mention - I tried a 10bit and 11bit signal and neither sound as good as 9bit at ~500k steps of training. It could simply...

@G-Wang > @geneing did you play around with seq_len to see what effects it has? I tried different training lengths for wavenet vocoder and found the same thing. By default...

I believe there is another problem with batch generation. The current implementation generates wave files that are a little short compared to the mel data. Wave data should be exactly...

@vcjob Do you know where we can find SSML annotated text and matching speech? Probably around 24h of good quality recordings and texts should be sufficient. If you can find...

@MXGray could you please upload the English model? The model linked from your drive ([ref](https://drive.google.com/file/d/1AtKUeUPp95NCdve2uwXb4JYTVJb1iAt0/view?usp=sharing)) doesn't generate English speech.

@MXGray Thank you. It works now.

@maozhiqiang That's unexpectedly slow. On my computer (a 6-year-old laptop) with the same hparams, it runs a little slower than real time. Let's check a few things: 1. Did...

Run ccmake or cmake-gui. Switch to advanced mode ("t" in ccmake, a checkbox in cmake-gui). Find the CMAKE_BUILD_TYPE entry and type RelWithDebInfo. Find CMAKE_CXX_FLAGS_RELWITHDEBINFO and edit it to include -ffast-math...

Sounds right. The "Eigen3" library that I use employs every templating trick to get the best performance. When optimized, its performance is excellent. In debug mode it is super inefficient. A few...