FlashTTS issues

Error during inference: TypeError: AsyncEngineArgs.__init__() got an unexpected keyword argument 'device'

1

As the title says: is this because the vllm version is too new? ``` [FlashTTS] 2025-09-01 20:07:35 [INFO] [infer:65] >> Start up FlashTTS to perform inference. [FlashTTS] 2025-09-01 20:07:35 [INFO] [infer:66] >> Inference args: Namespace(input='./test.txt', output='demo.wav', name=None, reference_audio=None, reference_text=None, latent_file=None, model_path='./models/SparkAudio/Spark-TTS-0.5B', backend='vllm',...

Artanisax
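One way to narrow down the question above is to check which vllm version is installed and whether a given constructor still accepts the `device` keyword. The sketch below uses only the standard library. `FakeEngineArgs` is a hypothetical stand-in so the example runs without vllm; to test a real install, pass the actual `AsyncEngineArgs` class to `accepts_kwarg` instead. Whether dropping `device` is safe for FlashTTS is an assumption, not something the issue confirms.

```python
import inspect
from dataclasses import dataclass
from importlib import metadata
from typing import Any, Callable, Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def accepts_kwarg(func: Callable[..., Any], name: str) -> bool:
    """Report whether a callable accepts `name` as a keyword argument."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def filter_kwargs(func: Callable[..., Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the keyword arguments that the callable accepts."""
    return {k: v for k, v in kwargs.items() if accepts_kwarg(func, k)}


# Hypothetical stand-in for an engine-args class that has no `device` field.
@dataclass
class FakeEngineArgs:
    model: str
    gpu_memory_utilization: float = 0.9


print("vllm version:", installed_version("vllm"))
requested = {"model": "./models/SparkAudio/Spark-TTS-0.5B", "device": "cuda"}
print("accepts device:", accepts_kwarg(FakeEngineArgs, "device"))
print("filtered kwargs:", filter_kwargs(FakeEngineArgs, requested))
```

If `accepts_kwarg` returns `False` for the real class, the installed vllm no longer takes that keyword, which would be consistent with a version mismatch. Silently filtering the argument can change where the model is placed, so pinning a compatible vllm version may be the safer route.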

Is there a ready-made Docker image?

fyr233

flashtts infer reports [ERROR] [spark_engine:354] >> Semantic tokens 预测为空 (semantic token prediction is empty)

1

I am using the vllm backend with the spark-tts model. Running `flashtts infer -i "调整音高和语速示例。" -m ./models/spark-tts -b vllm --pitch high --speed low -o tuned.wav` fails with `spark_engine.py", line 355, in _generate_audio_tokens [rank0]: raise ValueError(err_msg) [rank0]: ValueError: Semantic tokens 预测为空，prompt：调整音高和语速示例。，llm output：!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!...

UncleLLD

Add a default value for voice

Could voice be given a default value? When using the OpenAI plugin in Dify there is no way to set voice, so the service cannot be used.

OnL1on
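Until a server-side default exists, a client or proxy could fill in the missing field before forwarding the request. The sketch below is only an illustration of that idea: `with_default_voice` and the voice name `my-default-voice` are hypothetical, and the payload keys follow the OpenAI speech request shape (`model`, `input`, `voice`). It is not FlashTTS code.

```python
from typing import Any, Dict

# Hypothetical voice name; replace it with a voice the server actually knows.
DEFAULT_VOICE = "my-default-voice"


def with_default_voice(
    payload: Dict[str, Any], default_voice: str = DEFAULT_VOICE
) -> Dict[str, Any]:
    """Return a copy of a speech request payload with `voice` set if it is missing or empty."""
    normalized = dict(payload)  # shallow copy so the caller's dict is not mutated
    if not normalized.get("voice"):
        normalized["voice"] = default_voice
    return normalized


request = {"model": "tts-1", "input": "Hello"}
print(with_default_voice(request))
```

An explicit `voice` in the incoming payload is left untouched, so clients that do send one keep their choice.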

What is the latent_file for MegaTTS3?

8

What is the latent_file for MegaTTS3, and how should I prepare it? I looked around and could not find any documentation on this file.

Tian14267

Streaming: what is the lowest first-packet latency?

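In streaming mode, what is the lowest first-packet latency? Is concurrency supported, and how many concurrent streams can it handle?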

TTzhangheng

Installation fails when following the steps

2

drake@DESKTOP-05BHGRQ:~$ flashtts serve --model_path Spark-TTS-0.5B --backend vllm --llm_device cuda --tokenizer_device cuda --detokenizer_device cuda --wav2vec_attn_implementation sdpa --llm_attn_implementation sdpa --torch_dtype "bfloat16" --max_length 32768 --llm_gpu_memory_utilization 0.6 --fix_voice --host 0.0.0.0 --port 8000 [FlashTTS] 2025-08-16...

markmars

Can the reference audio be changed when reusing a voice?

1

How do I change the reference audio and define a new custom voice?

15755841658

Support the index-tts series

Please support the index-tts series; for reference, see https://github.com/Ksuriuri/index-tts-vllm

shell-nlp

Are there plans to support OpenAI-format calls?

5

Including voice registration and real-time streaming cloning: `import pyaudio from openai import OpenAI client = OpenAI() p = pyaudio.PyAudio() stream = p.open(format=8, channels=1, rate=24_000, output=True) with client.audio.speech.with_streaming_response.create( model="tts-1", voice="alloy", input="""I see skies of blue and clouds...

forrestsocool
