butterl
@syang1993 Are those samples generated directly from Tacotron? The audio quality is amazing.
@syang1993 thanks for reaching out, I've tried keithito/tacotron and Rayhane-mamah/Tacotron-2; both seem to generate wavs with shake & echo like @lapwing's sample and even worse (even with WaveNet at 300K steps as the vocoder), and...
@fazlekarim thanks for reaching out, I'd be very interested in your sample, because mine is much worse with the other repos even when trained to 400K steps, and now I will switch...
@fazlekarim thanks for reaching out, the wav is good, but it seems to have more shaking than the eval-87k-r2.zip @syang1993 shared. @syang1993 I trained on my machine and the result is good,...
@syang1993 The training step is 77k. I tried two experiments on eval: 1. use_gst=True, feeding a wav from the training set; the output sometimes fails (not aligned and the wav is...
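To make experiment 1 concrete, this is roughly how I turn a training-set wav into a reference mel for the GST conditioning. It is only a sketch: the hparam values and the file path are my assumptions and should match whatever the repo actually uses.

```python
import librosa
import numpy as np

# Assumed hparams; these must match the gst-tacotron config.
SAMPLE_RATE = 16000
N_FFT = 2048
HOP_LENGTH = 200   # 12.5 ms at 16 kHz
WIN_LENGTH = 800   # 50 ms at 16 kHz
N_MELS = 80

def reference_mel(path):
    """Load a training-set wav and compute a log-mel spectrogram
    to use as the GST reference, shaped (frames, n_mels)."""
    wav, _ = librosa.load(path, sr=SAMPLE_RATE)
    mel = librosa.feature.melspectrogram(
        y=wav, sr=SAMPLE_RATE, n_fft=N_FFT,
        hop_length=HOP_LENGTH, win_length=WIN_LENGTH, n_mels=N_MELS)
    return np.log(mel + 1e-5).T

ref = reference_mel("training_set/A2_0.wav")  # hypothetical path
print(ref.shape)
```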
@syang1993 Thanks! Will wait to see the good results. BTW, could we feed the eval mel to r9y9's WaveNet vocoder?
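In case anyone tries the same thing, here is the sanity check I would run on the eval mel before handing it to the vocoder. The expected layout and value range are assumptions on my side; they should be checked against the wavenet_vocoder preset and the Tacotron hparams.

```python
import numpy as np

mel = np.load("eval/mel-001.npy")  # hypothetical path to a Tacotron eval mel

# Assuming the vocoder conditions on (frames, num_mels) features,
# transpose if the Tacotron output came out as (num_mels, frames).
if mel.shape[0] < mel.shape[1]:
    mel = mel.T

print("frames x mels:", mel.shape)
print("min/max:", mel.min(), mel.max())

# If the vocoder preset expects features normalized to [0, 1] while the
# Tacotron mel is in dB, rescale first. The -100/120 numbers below are
# assumptions; use the actual min_level_db / ref_level_db from the hparams.
mel_norm = np.clip((mel + 100.0) / 120.0, 0.0, 1.0)
np.save("eval/mel-001-for-wavenet.npy", mel_norm)
```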
@syang1993 Tried with the 100K model; the output is good, but the eval text gets cut at ",", e.g. "he'd like to help the girl, who's wearing the red coat." will...
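I'm not sure where the cut happens, but if the eval script splits the input at punctuation and only synthesizes the first piece, that would explain it. A trivial workaround on my side is to strip the commas before feeding the text in; this is plain Python, nothing repo-specific:

```python
import re

text = "he'd like to help the girl, who's wearing the red coat."

# If the input is split at commas and only the first clause is kept,
# everything after the comma is dropped:
clauses = re.split(r",", text)
print(clauses[0])   # "he'd like to help the girl"

# Workaround: remove the commas before synthesis.
cleaned = text.replace(",", "")
print(cleaned)      # "he'd like to help the girl who's wearing the red coat."
```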
@syang1993 thanks for reaching out! The training dataset is more like news or poetry (THCHS-30), and by "out-of-collection" I mean unseen text (more like colloquial statements). The GST Tacotron...
Hi @cchen156, thanks for reaching out! From your reply, "The data is saved in 16bit png files": are the input images RGB but custom-preprocessed? I tried...
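For reference, this is how I read the 16-bit PNGs on my side; normalizing to [0, 1] is my own assumption about the expected preprocessing, not something confirmed by the repo.

```python
import cv2
import numpy as np

def load_png16(path):
    """Read a 16-bit PNG as float32 RGB in [0, 1]."""
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)  # keeps the uint16 depth
    if img is None:
        raise IOError("failed to read " + path)
    if img.dtype != np.uint16:
        raise ValueError("expected a 16-bit PNG, got %s" % img.dtype)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)    # OpenCV loads BGR
    return img.astype(np.float32) / 65535.0

img = load_png16("train/input/0001.png")  # hypothetical path
print(img.shape, img.min(), img.max())
```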
Hi @cchen156, I have trained an hdrnet model (PSNR only 21 dB; it works well most of the time but lacks contrast sometimes) and I want to use the same DNG test sets collected...
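And in case it matters, this is how I currently load the DNG test files before running them through my model. The rawpy postprocess settings are my own choices, not necessarily what was used to prepare the dataset.

```python
import rawpy
import numpy as np

def load_dng(path):
    """Demosaic a DNG to 16-bit RGB and scale to float32 in [0, 1]."""
    with rawpy.imread(path) as raw:
        rgb = raw.postprocess(use_camera_wb=True,
                              no_auto_bright=True,
                              output_bps=16)
    return rgb.astype(np.float32) / 65535.0

img = load_dng("test/0001.dng")  # hypothetical path
print(img.shape, img.dtype, img.max())
```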