Text to speech not working
Self Checks
- [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
- [x] Please do not modify this template :) and fill in all the required fields.
Dify version
0.15.3
Cloud or Self Hosted
Cloud
Steps to reproduce
- Create an empty chatflow
- Open the UI to enable the TTS feature
- Click the play button for the voice example
Can also be reproduced by clicking the play icon button on normal chat messages.
✔️ Expected Behavior
Dify plays audio file
❌ Actual Behavior
Infinite loading spinner.
Console logs:
Uncaught DOMException: MediaSource.addSourceBuffer: Type not supported in MediaSource
Cannot play media. No decoders for requested formats: audio/mpeg
I found a similar open issue regarding text-to-speech not working due to audio format compatibility with Firefox. The issue is likely because Firefox does not support the audio/mpeg format in MediaSource. Here are some suggested solutions:
- Return audio in a different format that is supported by Firefox, such as "opus", "aac", "flac", or "pcm".
- Switch to another method of playing audio that is compatible with Firefox.
- Alternatively, use a different browser like Chrome that supports the current audio format [1].
Additionally, you can change the audio format for text-to-speech output in Dify by modifying the audio_type property in the tts.yaml file [2].
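The second suggestion above (switching to another playback method) could be sketched roughly as follows. This is a minimal illustration, not Dify's actual frontend code: `pickPlaybackStrategy`, `MediaSourceLike`, and `PlaybackStrategy` are hypothetical names, and the only real browser API assumed is the static method `MediaSource.isTypeSupported`.

```typescript
// Minimal shape of the one browser API this sketch relies on.
// MediaSource.isTypeSupported(type) is a real static method;
// every other name in this block is hypothetical.
interface MediaSourceLike {
  isTypeSupported(mimeType: string): boolean;
}

type PlaybackStrategy = "media-source" | "blob-url";

// Decide how to play a TTS response: stream through MediaSource when the
// browser reports support for the MIME type, otherwise fall back to a plain
// audio element fed from a Blob URL.
function pickPlaybackStrategy(
  mimeType: string,
  mediaSource: MediaSourceLike | undefined
): PlaybackStrategy {
  if (mediaSource !== undefined && mediaSource.isTypeSupported(mimeType)) {
    return "media-source";
  }
  return "blob-url";
}
```

In a browser, the second argument would be `window.MediaSource` (which may be undefined where the API is not exposed). For the `"blob-url"` case, the downloaded bytes could be wrapped in a `Blob` with the response's MIME type and played with `new Audio(URL.createObjectURL(blob))`, at the cost of waiting for the full file instead of streaming it.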
+1. It only works in chat mode, not in chatflow mode.
It worked for me in 0.15.2, but after upgrading to 1.0.0, stopped working entirely on Safari (spins as perzeuss notes). On Edge and Chrome the audio is broken up and sometimes distorted.
Yes, me too. In 0.15.2 text-to-speech worked, but after upgrading to 1.0.0 the sound is sometimes distorted in Chrome.
My question: under Dify 1.0.0, there is sound when using tongyi/tts-1.0 for text-to-speech, but no sound when using SiliconFlow/fishaudio-tts. What is the reason? I also wrote a model myself that supports speech2text and TTS. It works normally as a tool, but text-to-speech in the app produces no sound and no error shows up in the logs. My TTS outputs WAV, while tongyi/tts-1.0 outputs MP3. Maybe this is the reason?
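One way to test the WAV-versus-MP3 guess is to check what a provider actually returns by looking at the first bytes of the audio. The sketch below is only a diagnostic heuristic and does not explain the missing sound by itself. `sniffAudioContainer` and `AudioContainer` are hypothetical names; the checks rely on the standard RIFF/WAVE header, the ID3 tag prefix, and the MPEG audio frame sync bits.

```typescript
type AudioContainer = "wav" | "mp3" | "unknown";

// Heuristically identify the audio format from the leading bytes of a response.
function sniffAudioContainer(bytes: Uint8Array): AudioContainer {
  // Read bytes [start, end) as an ASCII string.
  const ascii = (start: number, end: number): string => {
    let text = "";
    for (let i = start; i < end && i < bytes.length; i++) {
      text += String.fromCharCode(bytes[i]);
    }
    return text;
  };

  // WAV: "RIFF" at offset 0 and "WAVE" at offset 8.
  if (bytes.length >= 12 && ascii(0, 4) === "RIFF" && ascii(8, 12) === "WAVE") {
    return "wav";
  }
  // MP3 with an ID3v2 tag starts with "ID3".
  if (bytes.length >= 3 && ascii(0, 3) === "ID3") {
    return "mp3";
  }
  // Bare MPEG audio frame: 11 sync bits set, and the layer bits are not 00
  // (layer 00 would match AAC ADTS headers instead).
  if (
    bytes.length >= 2 &&
    bytes[0] === 0xff &&
    (bytes[1] & 0xe0) === 0xe0 &&
    (bytes[1] & 0x06) !== 0
  ) {
    return "mp3";
  }
  return "unknown";
}
```

Running this on the first bytes returned by each provider would show whether the silent ones really differ in format from the one that plays.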
I should add that I'm using OpenAI TTS. I've tried both tts-1-hd and tts-1, both with the same results. Self-hosted Docker deployment on Ubuntu.
Please let me know if there are any other details I can provide that would help developers troubleshoot.
Release 1.0.1 appears to fix the issue for me.
Hi, @perzeuss. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.
Issue Summary:
- The issue involves a bug in Dify version 0.15.3 where the text-to-speech feature fails with an infinite loading spinner and a "MediaSource.addSourceBuffer" error.
- The problem was noted to occur in chatflow mode but not in chat mode.
- Users reported similar issues in version 1.0.0 across multiple browsers.
- The issue has been resolved in release 1.0.1, as confirmed by @tjlindeman.
Next Steps:
- Please let us know if this issue is still relevant to the latest version of the Dify repository by commenting on this issue.
- If there is no further activity, this issue will be automatically closed in 15 days.
Thank you for your understanding and contribution!