
Sped-up voices have a pretty terrible ring to them (or am I just imagining this?)

Open corneliusroemer opened this issue 1 year ago • 1 comment

Sped-up voices have a pretty terrible ring to them (or am I just imagining this?). At least the 1.1x speed has this issue; I'm not sure about 1.01x.

Maybe this is an issue with the upstream OpenAI models?

Here are various speeds, created like this:

ospeak "Which voice do you prefer?" -v shimmer -m tts-1-hd -x 1.1 -o 11.wav 

These were converted with ffmpeg -i 1.wav 1.mp4 so I could upload them to the GitHub issue (click to open in the browser audio player).

1x speed: https://github.com/user-attachments/assets/d3b1cf69-2a56-40aa-a59f-61bb814f4478

1.01x speed: https://github.com/user-attachments/assets/11a6be73-4c80-490e-8005-6b983cb5a770

1.1x speed: https://github.com/user-attachments/assets/552e7882-d906-4525-88d7-e2118788b6aa

Original WAVs in a zip archive: Archive.zip

corneliusroemer avatar Aug 30 '24 12:08 corneliusroemer

I get much, much better results by speeding up manually with ffmpeg instead of using the OpenAI speed setting.

ffmpeg -i 1.wav -filter:a "atempo=1.1" 11_manual.wav

11_manual.wav.zip
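For anyone who wants to repeat the manual workaround at several speeds, here is a minimal sketch that wraps the ffmpeg call above in a small shell function. The function name speed_up and the DRY_RUN switch are hypothetical helpers added for illustration, not part of ospeak or ffmpeg. It assumes 1.wav was already generated at the default (1x) speed.

```shell
#!/bin/sh
# Sketch: speed up an existing WAV with ffmpeg's atempo filter instead of
# relying on the TTS speed option.
# speed_up and DRY_RUN are hypothetical names used only for this example.
# Usage: speed_up INPUT.wav FACTOR OUTPUT.wav
speed_up() {
    src="$1"      # input file, generated at 1x speed
    factor="$2"   # tempo multiplier, e.g. 1.1
    dst="$3"      # output file
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        printf 'ffmpeg -i %s -filter:a atempo=%s %s\n' "$src" "$factor" "$dst"
    else
        ffmpeg -i "$src" -filter:a "atempo=$factor" "$dst"
    fi
}
```

With the function loaded, `speed_up 1.wav 1.1 11_manual.wav` runs the same command as in the comment above, and setting DRY_RUN=1 first only prints it, which is handy for checking the arguments before processing a batch of files.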

corneliusroemer avatar Aug 30 '24 12:08 corneliusroemer