Can it mimic a person's voice?
Is there a way for it to mimic a person's voice by using that person's voice samples? Thank you.
There are ways to adapt speech models so they sound like a different person using only a few voice samples, but Mimic does not implement any of them. Here are some (fun?) things you can do instead:
You can multiply the original pitch by a factor:
./mimic -voice ap --setf "f0_shift=0.75" -t "lower pitch"
./mimic -voice ap --setf "f0_shift=1.0" -t "normal voice"
./mimic -voice ap --setf "f0_shift=1.5" -t "higher pitch"
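If you think in musical intervals rather than raw factors, a shift of n semitones corresponds to a frequency ratio of 2^(n/12). The sketch below converts semitones into an f0_shift value and only prints the resulting commands. It assumes f0_shift scales the pitch linearly, as the examples above suggest; semitones_to_f0_shift is a made-up helper name, not something Mimic provides.

```shell
#!/bin/sh
# Convert a pitch change in semitones into a multiplicative f0_shift factor.
# Assumption: f0_shift multiplies the pitch linearly.
# semitones_to_f0_shift is a hypothetical helper, not part of mimic.
semitones_to_f0_shift() {
    # ratio = 2^(n/12), computed as exp(n/12 * ln 2); LC_ALL=C keeps the dot decimal
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.3f\n", exp(n / 12 * log(2)) }'
}

# Print (rather than run) one mimic command per semitone offset.
for semitones in -5 0 7; do
    factor=$(semitones_to_f0_shift "$semitones")
    printf './mimic -voice ap --setf "f0_shift=%s" -t "shifted by %s semitones"\n' \
        "$factor" "$semitones"
done
```

Offsets of -5 and +7 semitones come out at roughly 0.749 and 1.498, which is close to the 0.75 and 1.5 factors used above.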
You can set the mean pitch in hertz. Typical male voices are between 85 and 180 Hz and typical female voices between 165 and 255 Hz (see https://en.wikipedia.org/wiki/Voice_frequency):
./mimic -voice ap --setf "int_f0_target_mean=200" -t "I'm mycroft speaking with higher pitch, am I not lovely?"
./mimic -voice ap --setf "int_f0_target_mean=50" -t "I am the evil mycroft"
You can also change the variability of the pitch (I still need to explore this one further):
./mimic -voice ap --setf "int_f0_target_stddev=10" -t "hello world, this is a longer speech"
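One way to explore the pitch variability is to listen to the same sentence at several values. This is a rough sketch that prints one command per value; the values swept are arbitrary starting points, not recommendations, and stddev_sweep is a hypothetical helper name.

```shell
#!/bin/sh
# Print one mimic command per pitch-variability value so they can be compared.
# stddev_sweep is a hypothetical helper, not part of mimic.
# Usage: stddev_sweep "text to speak" value [value ...]
stddev_sweep() {
    text=$1
    shift
    for stddev in "$@"; do
        printf './mimic -voice ap --setf "int_f0_target_stddev=%s" -t "%s"\n' \
            "$stddev" "$text"
    done
}

# The sweep values below are arbitrary examples.
stddev_sweep "hello world, this is a longer speech" 0 10 20 40
```

Piping the output into sh would run each command in turn, provided the text itself contains no double quotes.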
You can change the speed of the speech (values below 1.0 speak faster, values above 1.0 slower) with:
--setf duration_stretch=0.8
And you can combine all those variables:
./mimic -voice ap --setf "f0_shift=1.0" --setf "int_f0_target_stddev=10" --setf duration_stretch=0.8 -t "hello world, this is a longer speech"
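If you end up with combinations you like, a small wrapper can give them names. The sketch below is only an illustration: the preset names and their values are my own assumptions, loosely based on the examples in this thread, and Mimic itself has no notion of presets. It prints the command instead of running it.

```shell
#!/bin/sh
# Map a preset name to a set of --setf options.
# Preset names and values are illustrative assumptions, not mimic features.
mimic_preset_args() {
    case "$1" in
        deep)   echo '--setf f0_shift=0.75 --setf duration_stretch=1.2' ;;
        bright) echo '--setf f0_shift=1.5 --setf duration_stretch=0.8' ;;
        normal) echo '--setf f0_shift=1.0' ;;
        *)      echo "unknown preset: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the combined command instead of executing it.
# Usage: say_with_preset PRESET "text to speak"
say_with_preset() {
    args=$(mimic_preset_args "$1") || return 1
    printf './mimic -voice ap %s -t "%s"\n' "$args" "$2"
}

say_with_preset deep "I am the evil mycroft"
```

Keeping the presets in one case statement makes it easy to add or tweak a combination without touching the rest of the script.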
While what you want to do is not easy to implement in Mycroft, if you (or anyone else) are interested in playing with these values and documenting them (maybe suggesting better values or nice combinations!), pull requests would be welcome. For instance, I would love for Mycroft to talk in a deeper voice on Halloween or something like that...