Program hangs during inference with the CosyVoice-300M-Instruct model
@aluminumbox @v3ucn @iflamed @bearlu007 @boji123 Hi, thank you all for your hard work! Below is the specific problem I ran into; I'd really appreciate your help.
Bug Description
After completing the pre-launch fixes described below (see "Pre-launch Fixes"), I start the service with: python webui.py --model_dir ../pretrained_models/CosyVoice-300M-Instruct/. The moment I click the generate-audio button, the program hangs; waiting, retrying, and restarting all have no effect. No result is ever returned, yet the resources stay fully occupied the whole time. See the screenshots under "Screenshots While Hung".
Pre-launch Fixes
Building with the official Dockerfile produced two errors:
- ImportError: cannot import name 'cached_download' from 'huggingface_hub'. Fixed by downgrading: pip install huggingface-hub==0.25.2 (cached_download was removed from recent huggingface_hub releases, so the image's newer version no longer provides it).
- RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'. Fixed by upgrading torch and torchaudio: pip install torch==2.4.1 torchaudio==2.4.1. The full error follows:
Exception in thread Thread-3:
Traceback (most recent call last):
File "/opt/conda/envs/cosyvoice/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/opt/conda/envs/cosyvoice/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/workspace/CosyVoice/cosyvoice/cli/model.py", line 93, in llm_job
for i in self.llm.inference(text=text.to(self.device),
File "/opt/conda/envs/cosyvoice/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
response = gen.send(None)
File "/workspace/CosyVoice/cosyvoice/llm/llm.py", line 172, in inference
text, text_len = self.encode(text, text_len)
File "/workspace/CosyVoice/cosyvoice/llm/llm.py", line 75, in encode
encoder_out, encoder_mask = self.text_encoder(text, text_lengths, decoding_chunk_size=1, num_decoding_left_chunks=-1)
File "/opt/conda/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/__torch__/cosyvoice/transformer/encoder/___torch_mangle_5.py", line 22, in forward
masks = torch.bitwise_not(torch.unsqueeze(mask, 1))
embed = self.embed
_0 = torch.add(torch.matmul(xs, CONSTANTS.c0), CONSTANTS.c1)
~~~~~~~~~~~~ <--- HERE
input = torch.layer_norm(_0, [1024], CONSTANTS.c2, CONSTANTS.c3)
pos_enc = embed.pos_enc
Traceback of TorchScript, original code (most recent call last):
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
0%| | 0/1 [00:00<?, ?it/s]
2024-10-21 12:55:14,273 ERROR Exception iterating responses: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)
Traceback (most recent call last):
File "/opt/conda/envs/cosyvoice/lib/python3.8/site-packages/grpc/_server.py", line 589, in _take_response_from_response_iterator
return next(response_iterator), True
File "server.py", line 62, in Inference
for i in model_output:
File "/workspace/CosyVoice/cosyvoice/cli/cosyvoice.py", line 100, in inference_instruct
for model_output in self.model.tts(**model_input, stream=stream, speed=speed):
File "/workspace/CosyVoice/cosyvoice/cli/model.py", line 191, in tts
this_tts_speech = self.token2wav(token=this_tts_speech_token,
File "/workspace/CosyVoice/cosyvoice/cli/model.py", line 104, in token2wav
tts_mel, flow_cache = self.flow.inference(token=token.to(self.device),
File "/opt/conda/envs/cosyvoice/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/workspace/CosyVoice/cosyvoice/flow/flow.py", line 123, in inference
token = self.input_embedding(torch.clamp(token, min=0)) * mask
File "/opt/conda/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 162, in forward
return F.embedding(
File "/opt/conda/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)
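The second failure is more mechanical: F.embedding only accepts Long/Int indices, but the speech-token tensor reaching flow.inference arrives as floats. A minimal sketch of the kind of cast that sidesteps it, patching the line shown in the traceback in cosyvoice/flow/flow.py (illustrative only; the cleaner fix may be to ensure the tokens are produced as integers upstream):

```python
# cosyvoice/flow/flow.py, inside inference(): embedding indices must be
# integral, so cast the incoming token tensor to long before the lookup to
# avoid "Expected tensor for argument #1 'indices' ... got torch.FloatTensor".
token = self.input_embedding(torch.clamp(token, min=0).long()) * mask
```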
Screenshots While Hung
docker stats screenshot
Task Manager screenshot
WebUI screenshot
Startup Information
Launch command: python webui.py --model_dir ../pretrained_models/CosyVoice-300M-Instruct/
CosyVoice version: dfcd6d0a64918342582abc588af9e86eb404d05c
Startup log:
python webui.py --model_dir ../pretrained_models/CosyVoice-300M-Instruct/
failed to import ttsfrd, use WeTextProcessing instead
/opt/conda/lib/python3.10/site-packages/diffusers/models/lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
2024-10-22 10:21:43,049 INFO input frame rate=50
/opt/conda/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:134: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
WeightNorm.apply(module, name, dim)
/opt/CosyVoice/cosyvoice/dataset/processor.py:24: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
torchaudio.set_audio_backend('soundfile')
/opt/CosyVoice/cosyvoice/cli/frontend.py:57: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
self.spk2info = torch.load(spk2info, map_location=self.device)
2024-10-22 10:21:44,149 WETEXT INFO found existing fst: /opt/conda/lib/python3.10/site-packages/tn/zh_tn_tagger.fst
2024-10-22 10:21:44,149 INFO found existing fst: /opt/conda/lib/python3.10/site-packages/tn/zh_tn_tagger.fst
2024-10-22 10:21:44,149 WETEXT INFO /opt/conda/lib/python3.10/site-packages/tn/zh_tn_verbalizer.fst
2024-10-22 10:21:44,149 INFO /opt/conda/lib/python3.10/site-packages/tn/zh_tn_verbalizer.fst
2024-10-22 10:21:44,149 WETEXT INFO skip building fst for zh_normalizer ...
2024-10-22 10:21:44,149 INFO skip building fst for zh_normalizer ...
2024-10-22 10:21:44,375 WETEXT INFO found existing fst: /opt/conda/lib/python3.10/site-packages/tn/en_tn_tagger.fst
2024-10-22 10:21:44,375 INFO found existing fst: /opt/conda/lib/python3.10/site-packages/tn/en_tn_tagger.fst
2024-10-22 10:21:44,375 WETEXT INFO /opt/conda/lib/python3.10/site-packages/tn/en_tn_verbalizer.fst
2024-10-22 10:21:44,375 INFO /opt/conda/lib/python3.10/site-packages/tn/en_tn_verbalizer.fst
2024-10-22 10:21:44,375 WETEXT INFO skip building fst for en_normalizer ...
2024-10-22 10:21:44,375 INFO skip building fst for en_normalizer ...
/opt/CosyVoice/cosyvoice/cli/model.py:60: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
self.llm.load_state_dict(torch.load(llm_model, map_location=self.device), strict=False)
/opt/CosyVoice/cosyvoice/cli/model.py:64: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
self.flow.load_state_dict(torch.load(flow_model, map_location=self.device), strict=False)
/opt/CosyVoice/cosyvoice/cli/model.py:67: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
hift_state_dict = {k.replace('generator.', ''): v for k, v in torch.load(hift_model, map_location=self.device).items()}
2024-10-22 10:21:47,106 DEBUG load_ssl_context verify=True cert=None trust_env=True http2=False
2024-10-22 10:21:47,107 DEBUG Using selector: EpollSelector
2024-10-22 10:21:47,108 DEBUG load_verify_locations cafile='/opt/conda/lib/python3.10/site-packages/certifi/cacert.pem'
2024-10-22 10:21:47,114 DEBUG connect_tcp.started host='api.gradio.app' port=443 local_address=None timeout=3 socket_options=None
2024-10-22 10:21:47,144 DEBUG Starting new HTTPS connection (1): huggingface.co:443
2024-10-22 10:21:47,149 DEBUG connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7feca4105cf0>
2024-10-22 10:21:47,155 DEBUG start_tls.started ssl_context=<ssl.SSLContext object at 0x7fed2d2687c0> server_hostname='api.gradio.app' timeout=3
/opt/conda/lib/python3.10/site-packages/gradio/components/base.py:201: UserWarning: 'scale' value should be an integer. Using 0.5 will cause issues.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/gradio/components/base.py:201: UserWarning: 'scale' value should be an integer. Using 0.25 will cause issues.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/gradio/layouts/column.py:55: UserWarning: 'scale' value should be an integer. Using 0.25 will cause issues.
warnings.warn(
2024-10-22 10:21:47,501 DEBUG Using selector: EpollSelector
* Running on local URL: http://0.0.0.0:8000
2024-10-22 10:21:47,510 DEBUG load_ssl_context verify=True cert=None trust_env=True http2=False
2024-10-22 10:21:47,510 DEBUG load_verify_locations cafile='/opt/conda/lib/python3.10/site-packages/certifi/cacert.pem'
2024-10-22 10:21:47,515 DEBUG connect_tcp.started host='localhost' port=8000 local_address=None timeout=None socket_options=None
2024-10-22 10:21:47,516 DEBUG connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7fed197af280>
2024-10-22 10:21:47,516 DEBUG send_request_headers.started request=<Request [b'GET']>
2024-10-22 10:21:47,516 DEBUG send_request_headers.complete
2024-10-22 10:21:47,517 DEBUG send_request_body.started request=<Request [b'GET']>
2024-10-22 10:21:47,517 DEBUG send_request_body.complete
2024-10-22 10:21:47,517 DEBUG receive_response_headers.started request=<Request [b'GET']>
2024-10-22 10:21:47,517 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Tue, 22 Oct 2024 10:21:47 GMT'), (b'server', b'uvicorn'), (b'content-length', b'4'), (b'content-type', b'application/json')])
2024-10-22 10:21:47,518 INFO HTTP Request: GET http://localhost:8000/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-22 10:21:47,518 DEBUG receive_response_body.started request=<Request [b'GET']>
2024-10-22 10:21:47,518 DEBUG receive_response_body.complete
2024-10-22 10:21:47,518 DEBUG response_closed.started
2024-10-22 10:21:47,518 DEBUG response_closed.complete
2024-10-22 10:21:47,519 DEBUG close.started
2024-10-22 10:21:47,519 DEBUG close.complete
2024-10-22 10:21:47,519 DEBUG load_ssl_context verify=False cert=None trust_env=True http2=False
2024-10-22 10:21:47,521 DEBUG connect_tcp.started host='localhost' port=8000 local_address=None timeout=3 socket_options=None
2024-10-22 10:21:47,522 DEBUG connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7fed197ad3c0>
2024-10-22 10:21:47,522 DEBUG send_request_headers.started request=<Request [b'HEAD']>
2024-10-22 10:21:47,522 DEBUG send_request_headers.complete
2024-10-22 10:21:47,523 DEBUG send_request_body.started request=<Request [b'HEAD']>
2024-10-22 10:21:47,523 DEBUG send_request_body.complete
2024-10-22 10:21:47,523 DEBUG receive_response_headers.started request=<Request [b'HEAD']>
2024-10-22 10:21:47,533 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'date', b'Tue, 22 Oct 2024 10:21:47 GMT'), (b'server', b'uvicorn'), (b'content-length', b'37639'), (b'content-type', b'text/html; charset=utf-8')])
2024-10-22 10:21:47,533 INFO HTTP Request: HEAD http://localhost:8000/ "HTTP/1.1 200 OK"
2024-10-22 10:21:47,533 DEBUG receive_response_body.started request=<Request [b'HEAD']>
2024-10-22 10:21:47,533 DEBUG receive_response_body.complete
2024-10-22 10:21:47,533 DEBUG response_closed.started
2024-10-22 10:21:47,534 DEBUG response_closed.complete
2024-10-22 10:21:47,534 DEBUG close.started
2024-10-22 10:21:47,534 DEBUG close.complete
To create a public link, set `share=True` in `launch()`.
2024-10-22 10:21:47,535 DEBUG Starting new HTTPS connection (1): huggingface.co:443
2024-10-22 10:21:47,595 DEBUG https://huggingface.co:443 "HEAD /api/telemetry/gradio/initiated HTTP/1.1" 200 0
2024-10-22 10:21:47,786 DEBUG start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7fed2d723160>
2024-10-22 10:21:47,786 DEBUG send_request_headers.started request=<Request [b'GET']>
2024-10-22 10:21:47,787 DEBUG send_request_headers.complete
2024-10-22 10:21:47,787 DEBUG send_request_body.started request=<Request [b'GET']>
2024-10-22 10:21:47,787 DEBUG send_request_body.complete
2024-10-22 10:21:47,787 DEBUG receive_response_headers.started request=<Request [b'GET']>
2024-10-22 10:21:47,900 DEBUG https://huggingface.co:443 "HEAD /api/telemetry/gradio/launched HTTP/1.1" 200 0
2024-10-22 10:21:47,999 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Tue, 22 Oct 2024 10:21:47 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'21'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2024-10-22 10:21:48,000 INFO HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-22 10:21:48,000 DEBUG receive_response_body.started request=<Request [b'GET']>
2024-10-22 10:21:48,000 DEBUG receive_response_body.complete
2024-10-22 10:21:48,001 DEBUG response_closed.started
2024-10-22 10:21:48,001 DEBUG response_closed.complete
2024-10-22 10:21:48,001 DEBUG close.started
2024-10-22 10:21:48,001 DEBUG close.complete
2024-10-22 10:21:49,232 DEBUG start_tls.started ssl_context=<ssl.SSLContext object at 0x7fedfe5ab3c0> server_hostname='api.gradio.app' timeout=3
2024-10-22 10:21:49,842 DEBUG start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x7fee0ef84ca0>
2024-10-22 10:21:49,843 DEBUG send_request_headers.started request=<Request [b'GET']>
2024-10-22 10:21:49,843 DEBUG send_request_headers.complete
2024-10-22 10:21:49,843 DEBUG send_request_body.started request=<Request [b'GET']>
2024-10-22 10:21:49,843 DEBUG send_request_body.complete
2024-10-22 10:21:49,843 DEBUG receive_response_headers.started request=<Request [b'GET']>
2024-10-22 10:21:50,066 DEBUG receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Tue, 22 Oct 2024 10:21:49 GMT'), (b'Content-Type', b'application/json'), (b'Content-Length', b'3'), (b'Connection', b'keep-alive'), (b'Server', b'nginx/1.18.0'), (b'Access-Control-Allow-Origin', b'*')])
2024-10-22 10:21:50,067 INFO HTTP Request: GET https://api.gradio.app/gradio-messaging/en "HTTP/1.1 200 OK"
2024-10-22 10:21:50,067 DEBUG receive_response_body.started request=<Request [b'GET']>
2024-10-22 10:21:50,067 DEBUG receive_response_body.complete
2024-10-22 10:21:50,067 DEBUG response_closed.started
2024-10-22 10:21:50,067 DEBUG response_closed.complete
2024-10-22 10:21:50,067 DEBUG close.started
2024-10-22 10:21:50,067 DEBUG close.complete
2024-10-22 10:23:36,828 INFO get instruct inference request
0%| | 0/1 [00:00<?, ?it/s]2024-10-22 10:23:36,856 INFO synthesis text 我是通义实验室语音团队全新推出的生成式语音大模型,提供舒适自然的语音合成能力。
requirements.txt
aiofiles==23.2.1
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal==1.3.1
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
anyio==4.6.2.post1
asttokens @ file:///opt/conda/conda-bld/asttokens_1646925590279/work
astunparse==1.6.3
async-timeout==4.0.3
attrs==23.1.0
audioread==3.0.1
backcall @ file:///home/ktietz/src/ci/backcall_1611930011877/work
beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work
boltons @ file:///croot/boltons_1677628692245/work
brotlipy==0.7.0
certifi @ file:///home/conda/feedstock_root/build_artifacts/certifi_1725278078093/work/certifi
cffi @ file:///croot/cffi_1670423208954/work
chardet @ file:///home/builder/ci_310/chardet_1640804867535/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
click==8.1.7
cmake==3.30.5
coloredlogs==15.0.1
conda @ file:///home/conda/feedstock_root/build_artifacts/conda_1694556045812/work
conda-build==3.24.0
conda-content-trust @ file:///tmp/abs_5952f1c8-355c-4855-ad2e-538535021ba5h26t22e5/croots/recipe/conda-content-trust_1658126371814/work
conda-package-handling @ file:///croot/conda-package-handling_1672865015732/work
conda_package_streaming @ file:///croot/conda-package-streaming_1670508151586/work
conformer==0.3.2
contourpy==1.3.0
cryptography @ file:///croot/cryptography_1677533068310/work
cycler==0.12.1
decorator @ file:///opt/conda/conda-bld/decorator_1643638310831/work
diffusers==0.30.3
dnspython==2.3.0
einops==0.8.0
exceptiongroup==1.1.1
executing @ file:///opt/conda/conda-bld/executing_1646925071911/work
expecttest==0.1.4
fastapi==0.115.2
ffmpy==0.4.0
filelock @ file:///croot/filelock_1672387128942/work
flatbuffers==24.3.25
fonttools==4.54.1
frozenlist==1.4.1
fsspec==2024.10.0
gdown==5.2.0
glob2 @ file:///home/linux1/recipes/ci/glob2_1610991677669/work
gmpy2 @ file:///tmp/build/80754af9/gmpy2_1645455533097/work
gradio==5.3.0
gradio_client==1.4.2
h11==0.14.0
httpcore==1.0.6
httpx==0.27.2
huggingface-hub==0.26.1
humanfriendly==10.0
hydra-core==1.3.2
HyperPyYAML==1.2.2
hypothesis==6.75.2
idna @ file:///croot/idna_1666125576474/work
importlib_metadata==8.5.0
importlib_resources==6.4.5
inflect==7.4.0
ipython @ file:///croot/ipython_1680701871216/work
jedi @ file:///tmp/build/80754af9/jedi_1644315229345/work
Jinja2 @ file:///croot/jinja2_1666908132255/work
joblib==1.4.2
jsonpatch @ file:///tmp/build/80754af9/jsonpatch_1615747632069/work
jsonpointer==2.1
kiwisolver==1.4.7
lazy_loader==0.4
libarchive-c @ file:///tmp/build/80754af9/python-libarchive-c_1617780486945/work
librosa==0.10.2.post1
lightning==2.4.0
lightning-utilities==0.11.8
lit==18.1.8
llvmlite==0.43.0
markdown-it-py==3.0.0
MarkupSafe @ file:///opt/conda/conda-bld/markupsafe_1654597864307/work
matplotlib==3.9.2
matplotlib-inline @ file:///opt/conda/conda-bld/matplotlib-inline_1662014470464/work
mdurl==0.1.2
mkl-fft==1.3.6
mkl-random @ file:///work/mkl/mkl_random_1682950433854/work
mkl-service==2.4.0
modelscope==1.19.0
more-itertools==10.5.0
mpmath==1.3.0
msgpack==1.1.0
multidict==6.1.0
networkx==3.1
numba==0.60.0
numpy @ file:///work/mkl/numpy_and_numpy_base_1682953417311/work
nvidia-cublas-cu11==11.10.3.66
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu11==8.5.0.96
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu11==10.9.0.58
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu11==10.2.10.91
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.4.91
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu11==2.14.3
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu11==11.7.91
nvidia-nvtx-cu12==12.1.105
omegaconf==2.3.0
onnxruntime==1.19.2
openai-whisper==20240930
orjson==3.10.9
packaging @ file:///croot/packaging_1678965309396/work
pandas==2.2.3
parso @ file:///opt/conda/conda-bld/parso_1641458642106/work
pexpect @ file:///tmp/build/80754af9/pexpect_1605563209008/work
pickleshare @ file:///tmp/build/80754af9/pickleshare_1606932040724/work
Pillow==9.4.0
pkginfo @ file:///croot/pkginfo_1679431160147/work
platformdirs==4.3.6
pluggy @ file:///tmp/build/80754af9/pluggy_1648024709248/work
pooch==1.8.2
prompt-toolkit @ file:///croot/prompt-toolkit_1672387306916/work
propcache==0.2.0
protobuf==5.28.2
psutil @ file:///opt/conda/conda-bld/psutil_1656431268089/work
ptyprocess @ file:///tmp/build/80754af9/ptyprocess_1609355006118/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure-eval @ file:///opt/conda/conda-bld/pure_eval_1646925070566/work
pyarrow==17.0.0
pycosat @ file:///croot/pycosat_1666805502580/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pydantic==2.9.2
pydantic_core==2.23.4
pydub==0.25.1
Pygments @ file:///croot/pygments_1683671804183/work
pynini==2.1.6
pyOpenSSL @ file:///croot/pyopenssl_1677607685877/work
pyparsing==3.2.0
PySocks @ file:///home/builder/ci_310/pysocks_1640793678128/work
python-dateutil==2.9.0.post0
python-etcd==0.4.5
python-multipart==0.0.12
pytorch-lightning==2.4.0
pytz @ file:///croot/pytz_1671697431263/work
PyYAML @ file:///croot/pyyaml_1670514731622/work
regex==2024.9.11
requests @ file:///croot/requests_1682607517574/work
rich==13.9.2
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.12
ruff==0.7.0
safetensors==0.4.5
scikit-learn==1.5.2
scipy==1.14.1
semantic-version==2.10.0
shellingham==1.5.4
six @ file:///tmp/build/80754af9/six_1644875935023/work
sniffio==1.3.1
sortedcontainers==2.4.0
soundfile==0.12.1
soupsieve @ file:///croot/soupsieve_1680518478486/work
soxr==0.5.0.post1
stack-data @ file:///opt/conda/conda-bld/stack_data_1646927590127/work
starlette==0.40.0
sympy==1.13.1
threadpoolctl==3.5.0
tiktoken==0.8.0
tn==0.0.4
tomli @ file:///opt/conda/conda-bld/tomli_1657175507142/work
tomlkit==0.12.0
toolz @ file:///croot/toolz_1667464077321/work
torch==2.4.1
torchaudio==2.4.1
torchdata @ file:///__w/_temp/conda_build_env/conda-bld/torchdata_1682362130135/work
torchelastic==0.2.2
torchmetrics==1.5.0
torchtext==0.15.2
torchvision==0.15.2
tqdm @ file:///croot/tqdm_1679561862951/work
traitlets @ file:///croot/traitlets_1671143879854/work
triton==3.0.0
typeguard==4.3.0
typer==0.12.5
types-dataclasses==0.6.6
typing_extensions==4.12.2
tzdata==2024.2
urllib3 @ file:///croot/urllib3_1680254681959/work
uvicorn==0.32.0
wcwidth @ file:///Users/ktietz/demo/mc3/conda-bld/wcwidth_1629357192024/work
websockets==12.0
WeTextProcessing==1.0.4.1
wget==3.2
whisper==1.1.10
yarl==1.16.0
zipp==3.20.2
zstandard @ file:///croot/zstandard_1677013143055/work
I don't really understand the internals; does the fastapi version hang as well?

> does the fastapi version hang as well?

Yes, all three access paths hang: gRPC, HTTP, and the WebUI.
I deployed once back in July (commit id: 02f941d34885bdb08c4cbcbb4bb8e2cecad3d430), and Instruct-model inference was completely fine back then.
> does the fastapi version hang as well?

+1. Same model, via fastapi. If you don't kill the hung API process, it can even drag down the whole card: I run a 32-thread deployment across eight A800s, and a worker that hits this bug pushes its GPU to full load, so every other API instance on that card becomes extremely slow or hangs as well. It also has nothing to do with the input data: the exact request that caused the hang succeeds against another API instance, or against the same one after a redeploy.
> +1. Same model, via fastapi. If you don't kill the hung API process, it can even drag down the whole card ...

Is it intermittent for you? For me it reproduces every single time across gRPC, HTTP, and the WebUI; I have never once managed to synthesize audio.
> Is it intermittent for you? For me it reproduces every single time ...

I batch-run tens of thousands of items, and every run ends with one API worker stuck. If I restart and continue with the remaining data it gets stuck again, all the way down to the very last item. Oddly enough, swapping the API backend to fish seems to avoid the problem entirely.
> I batch-run tens of thousands of items, and every run ends with one API worker stuck ...

Really strange [shrug].
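Until the root cause is found, a client-side watchdog at least keeps one hung request from stalling an entire batch. A minimal sketch, where synthesize() is a hypothetical stand-in for whatever HTTP/gRPC call you make (note this only abandons the stuck call on the client side; the hung server process still needs a restart):

```python
import concurrent.futures

# Shared pool: a hung request leaks its worker thread, so keep max_workers
# comfortably above the number of hangs expected before a server restart.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=8)

def synthesize(text: str) -> bytes:
    """Hypothetical stand-in for the real HTTP/gRPC request to the TTS API."""
    raise NotImplementedError

def synthesize_with_timeout(text: str, timeout_s: float = 120.0):
    # Submit to a worker thread so the batch loop can give up on a hung call.
    future = _pool.submit(synthesize, text)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None  # treat as hung: log the text and retry on a fresh server
```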
Same problem here: it hangs out of nowhere. I want to batch-generate an audiobook with 1000+ chapters, and it hangs once per chapter.
> Same problem here: it hangs out of nowhere ...

While we're at it, what exactly does the instruct_text parameter do? From the code, tts_text and instruct_text are effectively concatenated to generate the audio, but the result feels like a lucky draw: either part of the text is missing from the audio, or two utterances get generated together.
> What exactly does the instruct_text parameter do? ...

They are indeed concatenated, but generating both together shouldn't happen; I've never run into that.
> They are indeed concatenated, but generating both together shouldn't happen ...

I mostly use zero_shot inference anyway.
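For what it's worth, on the instruct_text question above: conceptually, the instruction is tokenized as a prompt prefix that conditions the LLM, and the model is expected to emit speech tokens only for the tts_text part. A rough, hypothetical illustration (the names and the delimiter token here are assumptions, not the actual CosyVoice code):

```python
def encode(text: str) -> list[str]:
    """Toy character-level stand-in; CosyVoice uses its own text tokenizer."""
    return list(text)

# The instruction becomes a prompt prefix (with an assumed end-of-prompt
# delimiter); the LLM conditions on it but should only speak tts_text.
# If the model loses track of that boundary, audio can drop words or render
# both parts, which would match the "lucky draw" behaviour described above.
instruct_text = "Speak in a gentle, soothing tone."
tts_text = "今天天气很好。"
llm_input = encode(instruct_text + "<endofprompt>") + encode(tts_text)
print(len(llm_input))
```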
This issue is stale because it has been open for 30 days with no activity.
Same here: the model runs inference normally for a while, then new API requests block and never return. CosyVoice2 with use_trt.
> I batch-run tens of thousands of items, and every run ends with one API worker stuck ...

Did you ever manage to solve this?