
Unsupervised Speech Decomposition Via Triple Information Bottleneck

37 SpeechSplit issues

Greetings, and thanks for such a good project. In my experiment, I used the same VCTK dataset as yours, and I have only trained for 68,000 steps. The log of my experiment...

Hello! I have only recently started learning about voice conversion, so I have many questions, one of which is how to specify the source speech and the target speech...

After testing the demo on my own data, I found that the content of the generated speech was not converted. Why is this? Looking at the comparison chart, the...

I have two questions, which may be basic. What do "R", "F", and "U" mean in demo.py, respectively? And how can I obtain the content embedding, pitch embedding, and rhythm embedding?

Hi. Thank you for the fantastic project. Is your model capable of transferring content, rhythm, and pitch between different sentences? I've prepared a demo.pkl file in such a way that...

The loss on my training set looks normal, but the loss on the validation set keeps rising...

Hello, thank you so much for the code and paper! I'm trying to train the model on [speech command data](http://storage.googleapis.com/download.tensorflow.org/data/mini_speech_commands.zip). I've made the train and validation data sets through 2...

I don't understand how to get alignment when the input utterance to the rhythm encoder is different from the input utterance to the pitch/content encoders. P.S.: I also don't understand the implementation details of the variant in Appendix B.3...

If the lengths of the content code, rhythm code, and pitch code differ from each other, how are they aligned, given that there is no attention mechanism in the decoder?
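A common way to handle this (and roughly what decoders without attention do) is to upsample each code to a shared frame length and concatenate them channel-wise before decoding. The sketch below illustrates the idea with nearest-neighbor upsampling; the array shapes and the `upsample_nearest` helper are illustrative assumptions, not SpeechSplit's actual implementation.

```python
import numpy as np

def upsample_nearest(code: np.ndarray, target_len: int) -> np.ndarray:
    """Stretch a (T, C) code sequence to target_len frames by
    nearest-neighbor repetition along the time axis."""
    t = code.shape[0]
    # Map each output frame index back to its nearest source frame.
    idx = np.floor(np.arange(target_len) * t / target_len).astype(int)
    return code[idx]

# Hypothetical codes with mismatched lengths: (frames, channels).
content = np.random.randn(32, 8)   # content code, 32 frames
rhythm  = np.random.randn(16, 4)   # rhythm code, 16 frames
pitch   = np.random.randn(64, 4)   # pitch code, 64 frames

# Bring all codes to the longest time axis, then stack channels.
T = max(content.shape[0], rhythm.shape[0], pitch.shape[0])
aligned = np.concatenate(
    [upsample_nearest(c, T) for c in (content, rhythm, pitch)],
    axis=1,
)
print(aligned.shape)  # (64, 16): fixed-length decoder input
```

With the time axes matched this way, the decoder can consume a single fixed-length tensor, so no attention mechanism is needed to align the three streams.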