FunASR

IndexError: index 2 is out of bounds for dimension 1 with size 2

Open passerbya opened this issue 10 months ago • 5 comments

Process Process-6:
Traceback (most recent call last):
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/data2/workspace/egs_vocal_extractor/data/speech_det.py", line 156, in process_audio_task
    res = model.generate(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 306, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 464, in inference_with_vad
    results = self.inference(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 345, in inference
    res = model.inference(**batch, **kwargs)
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/models/sense_voice/model.py", line 931, in inference
    align = ctc_forced_align(
  File "/root/miniconda3/envs/sensevoice/lib/python3.10/site-packages/funasr/models/sense_voice/utils/ctc_alignment.py", line 45, in ctc_forced_align
    best_score[:, padding_num + 0] = log_probs[:, 0, blank]
IndexError: index 2 is out of bounds for dimension 1 with size 2

These are the parameters I am calling it with:

model = AutoModel(
    model=model_dir,
    trust_remote_code=False,
    vad_model="fsmn-vad",
    vad_kwargs={"max_single_segment_time": 30000},
    device=f"cuda:{device_id}",
    disable_update=True,
    disable_pbar=True,
    disable_log=True,
)

res = model.generate(
    input=str(flac_path),
    cache={},
    language="auto",  # "zh", "en", "yue", "ja", "ko", "nospeech"
    use_itn=True,
    batch_size_s=60,
    merge_vad=True,  #
    output_timestamp=True,
    merge_length_s=15,
)
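Until the root cause is addressed, one way to keep a worker process from dying on this error is to catch it around the call and record which file triggered it. This is only a sketch under stated assumptions: `safe_generate` and the `failed` list are hypothetical names that are not part of FunASR, and the helper only assumes that the model object exposes `generate(input=..., **kwargs)` as in the call above.

```python
import logging

logger = logging.getLogger("speech_det")


def safe_generate(model, path, failed, **kwargs):
    """Call model.generate and return its result, or None on IndexError.

    Hypothetical helper (not a FunASR API): `model` is any object with a
    generate(input=..., **kwargs) method, and `failed` is a list that
    collects the paths whose processing raised IndexError.
    """
    try:
        return model.generate(input=str(path), **kwargs)
    except IndexError as exc:
        # Record the input so the failing audio can be inspected later.
        logger.warning("generate failed for %s: %s", path, exc)
        failed.append(str(path))
        return None
```

Collecting the failing paths this way may also help narrow down which audio files trigger the error, since the crash does not occur on every input.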

passerbya, Mar 20 '25 04:03

After adding print statements, I found output like this: padded_t: 2, padding_num: 2, t_a_r_g_e_t_s.size(-1): 0
I am not sure whether changing
padded_t = padding_num + t_a_r_g_e_t_s.size(-1)
to
padded_t = max(padding_num + t_a_r_g_e_t_s.size(-1), padding_num + 1)
would solve the problem.
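The arithmetic behind the printed values can be reproduced without the model. The sketch below uses a plain Python list in place of the best_score tensor. The function names are hypothetical, and the early return for an empty target sequence is an assumption about what a reasonable guard might look like, not the library's code and not the change from any PR.

```python
NEG_INF = float("-inf")


def init_best_score_unguarded(extended_len, first_blank_score, padding_num=2):
    """Mimic the failing statement: allocate padded_t slots, then write slot padding_num."""
    padded_t = padding_num + extended_len
    best_score = [NEG_INF] * padded_t
    # With extended_len == 0, padded_t == padding_num, so this index does not exist.
    best_score[padding_num + 0] = first_blank_score
    return best_score


def init_best_score(extended_len, first_blank_score, padding_num=2):
    """Same initialisation, but return None when there is nothing to align.

    `extended_len` plays the role of t_a_r_g_e_t_s.size(-1) in the report.
    The None return is an assumed guard for illustration only.
    """
    if extended_len == 0:
        return None
    return init_best_score_unguarded(extended_len, first_blank_score, padding_num)
```

With extended_len = 0 the unguarded version raises IndexError, matching the report. The proposed max(...) change would make this first assignment succeed, but the thread does not show whether later statements in the function also index into the empty sequence, so it remains untested here.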

passerbya, Mar 20 '25 06:03

Has this been resolved?

fengyang1997, Mar 24 '25 02:03

Same problem.

File "/usr/local/lib/python3.12/dist-packages/funasr/models/sense_voice/utils/ctc_alignment.py", line 45, in ctc_forced_align                          
  best_score[:, padding_num + 0] = log_probs[:, 0, blank]                                                                                              
  ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^                                                                                                                       
IndexError: index 2 is out of bounds for dimension 1 with size 2

Isuxiz, Mar 26 '25 10:03

Found a temporary workaround; please see this PR.

Isuxiz, Mar 26 '25 11:03

Still not resolved. On my side this is an intermittent bug, and I have not yet figured out what triggers it.

MuBai-He, Apr 17 '25 00:04