IndexError: index 2 is out of bounds for dimension 1 with size 2
Process Process-6:
Traceback (most recent call last):
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/data2/workspace/egs_vocal_extractor/data/speech_det.py", line 156, in process_audio_task
    res = model.generate(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 306, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 464, in inference_with_vad
    results = self.inference(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 345, in inference
    res = model.inference(**batch, **kwargs)
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/models/sense_voice/model.py", line 931, in inference
    align = ctc_forced_align(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/models/sense_voice/utils/ctc_alignment.py", line 45, in ctc_forced_align
    best_score[:, padding_num + 0] = log_probs[:, 0, blank]
IndexError: index 2 is out of bounds for dimension 1 with size 2
Here are the parameters I call it with:
model = AutoModel(
    model=model_dir,
    trust_remote_code=False,
    vad_model="fsmn-vad",
    vad_kwargs={"max_single_segment_time": 30000},
    device=f"cuda:{device_id}",
    disable_update=True,
    disable_pbar=True,
    disable_log=True,
)
res = model.generate(
    input=str(flac_path),
    cache={},
    language="auto",  # "zh", "en", "yue", "ja", "ko", "nospeech"
    use_itn=True,
    batch_size_s=60,
    merge_vad=True,  #
    output_timestamp=True,
    merge_length_s=15,
)
After adding a print statement, I see output like this:
padded_t: 2, padding_num: 2, t_a_r_g_e_t_s.size(-1): 0
Would changing
padded_t = padding_num + t_a_r_g_e_t_s.size(-1)
to
padded_t = max(padding_num + t_a_r_g_e_t_s.size(-1), padding_num + 1)
fix the problem?
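To make the arithmetic concrete, here is a minimal sketch that reproduces only the buffer sizing and the first write from line 45, using a plain Python list in place of the tensor. The helper names `alloc_best_score` and `first_write_ok` are made up for illustration and are not part of funasr; the sketch assumes nothing about what the library does after line 45.

```python
def alloc_best_score(target_len, padding_num=2, patched=False):
    """Mimic only the buffer sizing from the report, using a plain list."""
    padded_t = padding_num + target_len
    if patched:
        # The change proposed above: force at least one slot past the padding.
        padded_t = max(padded_t, padding_num + 1)
    return [float("-inf")] * padded_t


def first_write_ok(target_len, patched=False, padding_num=2):
    """Return True if the write at index padding_num + 0 stays in bounds."""
    best_score = alloc_best_score(target_len, padding_num, patched)
    try:
        best_score[padding_num + 0] = 0.0  # stands in for log_probs[:, 0, blank]
        return True
    except IndexError:
        return False
```

With `target_len=0` (the printed case), `first_write_ok(0)` is False and `first_write_ok(0, patched=True)` is True, so the proposed change does get past this one write. It only covers index `padding_num + 0`, though. Any later access at `padding_num + 1`, or into the empty `t_a_r_g_e_t_s`, would still be out of range, and the thread does not show those lines, so whether the change is sufficient remains open.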
Has this been resolved?
Same problem.
  File "/usr/local/lib/python3.12/dist-packages/funasr/models/sense_voice/utils/ctc_alignment.py", line 45, in ctc_forced_align
    best_score[:, padding_num + 0] = log_probs[:, 0, blank]
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
IndexError: index 2 is out of bounds for dimension 1 with size 2
I found a temporary workaround; please see this PR.
Still not resolved. On my side this is an intermittent bug, and I haven't figured out what triggers it yet.
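Since the failure is intermittent, one caller-side option is to record which inputs trigger it and retry them without timestamps. The sketch below is only an illustration: `generate_with_fallback`, `failed_inputs`, and the logger name are hypothetical, and it assumes (not confirmed in this thread) that passing `output_timestamp=False` skips the forced-alignment step that raises.

```python
import logging

logger = logging.getLogger("speech_det")  # hypothetical logger name


def generate_with_fallback(generate_fn, failed_inputs, **kwargs):
    """Call generate_fn(**kwargs); on IndexError, record the input and retry
    once with output_timestamp=False (assumed to skip forced alignment)."""
    try:
        return generate_fn(**kwargs)
    except IndexError as exc:
        # Keep the offending input so the trigger can be studied later.
        failed_inputs.append(kwargs.get("input"))
        logger.warning("alignment failed for %s: %s", kwargs.get("input"), exc)
        retry_kwargs = dict(kwargs, output_timestamp=False)
        return generate_fn(**retry_kwargs)
```

It would be called as `generate_with_fallback(model.generate, failed, input=str(flac_path), ...)`, with `failed` being a list owned by the caller. The collected paths could then be replayed one at a time to look for a common pattern, for example segments where the recognized text is empty.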