Ping
So why is the first dimension not the same? I used the mel feature, whose shape is (385, 80), together with your model and your speaker embedding in "metadata.pkl" to generate...
I used your code and your parameters from issue #4 to generate the mel feature. The hop_size is 256 and the resulting shape is (385, 80). The code is below....
I have another question: I used the following code instead of soundfile to read the data: `x, fs = librosa.load(os.path.join(dirName, subdir, fileName), sr=16000)`. However, the final dimension is (129,...
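A minimal sketch of how the frame count depends on the sample count, which may explain the drop from 385 to 129 frames. It assumes (this is not confirmed in the thread) that the source file is sampled at 48 kHz and that frames come from a centered STFT with a hop of 256, so that the frame count is `1 + n_samples // hop_length`. The helper `num_frames` and the sample count below are hypothetical, chosen only for illustration.

```python
def num_frames(n_samples: int, hop_length: int = 256) -> int:
    """Frame count of a centered STFT: one frame per full hop, plus one."""
    return 1 + n_samples // hop_length


# Hypothetical clip: 98304 samples, i.e. 2.048 s at an assumed 48 kHz.
n_48k = 98304

# Resampling to 16 kHz keeps the duration but leaves one third of the samples.
n_16k = n_48k * 16000 // 48000

frames_48k = num_frames(n_48k)  # audio read at its native (assumed) rate
frames_16k = num_frames(n_16k)  # same audio after resampling with sr=16000

print(frames_48k, frames_16k)  # 385 129
```

Under these assumptions, reading the file without resampling gives three times as many samples, and therefore roughly three times as many frames, as loading it with `sr=16000`.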
> Thank you!
>
> Hello, I ran into the same problem as you. I'd like to generate my own "metadata.pkl" file to convert the voice...
> This is because I use the VCTK corpus downloaded from https://datashare.ed.ac.uk/handle/10283/2950. I could not find any audio sampled at 16 kHz there, so I...