CosyVoice: after adding a new voice, calling server.py produces the wrong timbre

I run the code below to save the voice. Testing with the saved speaker in the same script works fine, and the timbre matches.

# Assumes cosyvoice, prompt_text and prompt_speech_16k were set up earlier
import torchaudio

# Save the speaker features
spk_id = 'tim_spk2'
success = cosyvoice.add_zero_shot_spk(prompt_text, prompt_speech_16k, spk_id) is True
if not success:
    print(f"Failed to add speaker {spk_id}")
cosyvoice.save_spkinfo()
# Use the saved speaker
for i, j in enumerate(cosyvoice.inference_zero_shot('这是一段生成的声音，咋样呢', '', '', zero_shot_spk_id=spk_id, stream=False)):
    torchaudio.save('zero_shot2_{}.wav'.format(i), j['tts_speech'], cosyvoice.sample_rate)
cosyvoice.save_spkinfo()
print("Voice clone saved successfully~")

However, the audio generated through the pretrained-voice mode in webui.py, or through runtime/python/grpc/server.py, has a very different timbre. What is going on?
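
One way to narrow this down is to check what the serving process actually has stored for the speaker ID. The helper below is only a diagnostic sketch: `describe_speaker` is a hypothetical name, and it assumes the saved speaker info is a plain dict keyed by speaker ID whose values are dicts of features. The file location in the comments is also an assumption, not something confirmed in this thread.

```python
from typing import Any, List, Mapping, Optional


def describe_speaker(spk2info: Mapping[str, Mapping[str, Any]],
                     spk_id: str) -> Optional[List[str]]:
    """Return the sorted field names stored for spk_id, or None if it is absent."""
    entry = spk2info.get(spk_id)
    if entry is None:
        return None
    return sorted(entry.keys())


# Placeholder data standing in for the real speaker table; the field names
# here are made up for illustration only.
demo_spk2info = {
    'tim_spk2': {'field_b': 0, 'field_a': 0},
    'builtin_spk': {'field_a': 0},
}

print(describe_speaker(demo_spk2info, 'tim_spk2'))
print(describe_speaker(demo_spk2info, 'not_saved'))

# With the real data one might load the table first, for example
# (assumption: save_spkinfo() wrote a dict to <model_dir>/spk2info.pt):
#   import torch
#   spk2info = torch.load(f'{model_dir}/spk2info.pt', map_location='cpu')
#   print(describe_speaker(spk2info, 'tim_spk2'))
```

If the speaker is missing in the server's model directory, or its stored fields differ from those of a built-in speaker, that would point to the server reading different speaker data, or using it through a different inference path than `inference_zero_shot`.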

Oct 22 '25 09:10 zkt168

This issue is stale because it has been open for 30 days with no activity.

Nov 22 '25 02:11 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

Dec 06 '25 02:12 github-actions[bot]