Gary Wang
Note also that the paper's dataset is the Blizzard dataset, which has a lot of prosody variation in the reading, where the reader performs different character voices. This is why...
My apologies, I didn't see that you've already uploaded your alternative model. Thanks for including the architecture diagram; I'll give it a spin.
@fatchord Absolutely, your model converges much faster and is easier to train than all the WaveNet implementations; I can get pretty fast training even on a measly GTX 1060. I...
@fatchord I'm not sure if you've already done ablation studies, but I think your idea of providing the 1-D ResNet really helps the model converge quickly. Great work!
@fatchord I have some compute, so I'll run some higher-bit runs to see whether further training/tuning helps.
@fatchord Given that the alternative model predicts the bits directly (without splitting them into coarse/fine), I'm not sure whether a softmax over 4096 classes (for 12-bit audio)...
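To make the tradeoff concrete, here is a rough parameter-count comparison between the two output heads. This is only a back-of-the-envelope sketch; the hidden size of 512 is a hypothetical value, not taken from either repo.

```python
# Hypothetical hidden size of the layer feeding the output projection.
hidden = 512

# Coarse/fine split (as in the WaveRNN paper): two softmaxes,
# each over 2**8 = 256 classes, for 16-bit audio split into two bytes.
coarse_fine_params = 2 * (hidden * 256 + 256)

# Single softmax over all 12 bits at once: 2**12 = 4096 classes.
single_params = hidden * 4096 + 4096

print(coarse_fine_params)  # parameters in the split heads
print(single_params)       # parameters in the single large head
```

The single 4096-way head costs roughly 8x the output-layer parameters of the split heads, which is one reason the coarse/fine factorization is attractive at higher bit depths.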
@geneing did you play around with seq_len to see what effect it has? From my experience, longer seq_len degraded model performance. However, training with seq_len...
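For context, seq_len here controls the length of the randomly cropped training window. A minimal sketch of that cropping, assuming a hypothetical `hop_length` relating mel frames to wav samples (the function and names are illustrative, not from either repo):

```python
import random

def sample_training_window(wav, mel, seq_len, hop_length):
    """Randomly crop a fixed-length training window from a (wav, mel) pair.

    seq_len is the window length in wav samples; each mel frame is assumed
    to cover hop_length samples, so the mel window has seq_len // hop_length
    frames.
    """
    mel_frames = seq_len // hop_length
    max_start = len(mel) - mel_frames - 1          # leave room for the window
    mel_start = random.randint(0, max_start)
    mel_win = mel[mel_start : mel_start + mel_frames]
    wav_start = mel_start * hop_length
    wav_win = wav[wav_start : wav_start + seq_len]
    return wav_win, mel_win
```

A longer seq_len means fewer, longer windows per batch, which changes the effective gradient noise; that may be part of why performance shifts with it.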
@fatchord I also made a separate repo to refactor the code as well as add a few things. https://github.com/G-Wang/WaveRNN-Pytorch I made attempts at training a single beta distribution (similar...
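For readers unfamiliar with the beta-distribution output head: instead of a softmax over discrete levels, the network emits two unconstrained values that parameterize a Beta distribution, and the sample is rescaled to the audio range. A minimal sketch of the sampling step, assuming a softplus mapping to keep the parameters positive (the exact parameterization in the repo may differ):

```python
import math
import random

def sample_from_beta(raw_alpha, raw_beta):
    """Map two unconstrained network outputs to Beta parameters and sample.

    Softplus (plus a small epsilon) keeps alpha and beta strictly positive;
    the Beta sample in [0, 1] is then rescaled to the audio range [-1, 1].
    """
    alpha = math.log1p(math.exp(raw_alpha)) + 1e-4
    beta = math.log1p(math.exp(raw_beta)) + 1e-4
    x = random.betavariate(alpha, beta)   # sample in [0, 1]
    return 2.0 * x - 1.0                  # rescale to [-1, 1]
```

The appeal is a continuous, bounded output with only two parameters per sample, avoiding a large softmax entirely.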
Yes, I was thinking of splitting the mel with overlaps, so we give the model room to generate the right wav.
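The overlap-splitting idea can be sketched as follows; the chunking function here is illustrative, assuming the mel is a sequence of frames and the overlap gives each chunk extra left/right context:

```python
def split_with_overlap(mel_frames, chunk, overlap):
    """Split a sequence of mel frames into chunks of length `chunk`,
    where consecutive chunks share `overlap` frames of context."""
    hop = chunk - overlap
    chunks = []
    for start in range(0, max(len(mel_frames) - overlap, 1), hop):
        chunks.append(mel_frames[start:start + chunk])
    return chunks
```

Each chunk can then be synthesized independently (and in parallel as a batch), with the overlapping frames giving the model room at the boundaries.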
@fatchord @geneing Here are some samples for you. https://soundcloud.com/gary-wang-23/sets/wavernn-batch I've included samples for batch synthesis, where the speed is faster than real time (around 2 seconds to generate a 6...
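After batch synthesis, the generated chunks have to be stitched back together; one common approach is a linear crossfade over the overlap region. This is a generic sketch of that idea, not the stitching used in the repo:

```python
def crossfade_join(chunks, overlap):
    """Stitch generated wav chunks, linearly crossfading over `overlap`
    samples so chunk boundaries don't click."""
    out = list(chunks[0])
    for ch in chunks[1:]:
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)                      # fade-in weight
            out[-overlap + i] = out[-overlap + i] * (1 - w) + ch[i] * w
        out.extend(ch[overlap:])
    return out
```

Because the chunks are generated in one batched forward pass, wall-clock synthesis time stays roughly constant as the utterance gets longer, which is what makes faster-than-real-time generation possible.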