Transforms and vocoder relation
Hey,
I am trying to run msanii for an inpainting task on my data, and I am facing the issue that the results are extremely noisy.
I noticed that even if I don't run the inpainter itself, but just run the transforms and then __vocode, I don't get an identical result, especially if use_neural_vocoder is set to true (with just the inverse transform it is less noisy). I suppose there is some configuration I am missing.
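(For context, here is a generic numpy sketch, not the project's code, of why a transform → vocode round trip is not expected to be bit-identical: magnitude-style transforms discard phase, which the vocoder then has to estimate.)

```python
import numpy as np

# Hypothetical illustration (not the msanii API): a spectrogram round trip
# is only lossless if the full complex spectrum is kept. Magnitude/mel-type
# transforms drop the phase, so the inverse can never be exact.

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)  # stand-in for a short audio signal

spec = np.fft.rfft(x)

# Lossless: invert the full complex spectrum.
x_full = np.fft.irfft(spec, n=x.size)

# Lossy: keep only the magnitude and implicitly assume zero phase.
x_mag = np.fft.irfft(np.abs(spec), n=x.size)

err_full = np.max(np.abs(x - x_full))  # tiny, numerical precision only
err_mag = np.max(np.abs(x - x_mag))   # large: phase information was lost
print(f"full-spectrum error: {err_full:.2e}")
print(f"magnitude-only error: {err_mag:.2e}")
```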
Could you please tell me if I need to fix something to make it work?
Thanks in advance!
Hello, what type of data are you working with?
I have a short .wav audio file and want to inpaint it according to the mask I specify in the config.
Is it a piano song?
Is your method supposed to work only with piano audio? Mine is not piano.
Yeah. The neural vocoder was trained on the POP909 dataset rendered using FluidSynth, so it cannot generalize well to other data.
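(For out-of-domain audio, disabling the neural vocoder and using the plain inverse transform, which the reporter already found less noisy, may be the better default. A hypothetical config fragment; only the use_neural_vocoder flag itself appears in this thread, its placement in the file is an assumption:)

```yaml
# Hypothetical sketch: fall back to the plain inverse transform
# instead of the POP909-trained neural vocoder.
use_neural_vocoder: false
```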
Got it, thank you