NeMo/tutorials/speaker_tasks/ASR_with_SpeakerDiarization needs confidence estimation
While running offline ASR with speaker diarization (the ASR_with_SpeakerDiarization tutorial), I noticed that the function `convert_word_dict_seq_to_ctm` in nemo/collections/asr/parts/utils/diarization_utils.py always writes a confidence of zero:

```python
def convert_word_dict_seq_to_ctm(
    word_dict_seq_list: List[Dict[str, float]], uniq_id: str = 'null', decimals: int = 3
) -> Tuple[List[str], str]:
    """
    Convert word_dict_seq_list into a list containing transcription in CTM format.

    Args:
        word_dict_seq_list (list):
            List containing words and corresponding word timestamps in dictionary format.

            Example:
            >>> word_dict_seq_list = \
            >>> [{'word': 'right', 'start_time': 0.0, 'end_time': 0.34, 'speaker': 'speaker_0'},
                 {'word': 'and', 'start_time': 0.64, 'end_time': 0.81, 'speaker': 'speaker_1'},
                 ...],

    Returns:
        ctm_lines_list (list):
            List containing the hypothesis transcript in CTM format.

            Example:
            >>> ctm_lines_list= ["my_audio_01 speaker_0 0.0 0.34 right 0",
                                 my_audio_01 speaker_0 0.64 0.81 and 0",
    """
    ctm_lines = []
    confidence = 0
    for word_dict in word_dict_seq_list:
        spk = word_dict['speaker']
        stt = word_dict['start_time']
        dur = round(word_dict['end_time'] - word_dict['start_time'], decimals)
        word = word_dict['word']
        ctm_line_str = f"{uniq_id} {spk} {stt} {dur} {word} {confidence}"
        ctm_lines.append(ctm_line_str)
    return ctm_lines
```

The `confidence` field is hard-coded to 0 for every word. Is it possible to use confidence estimation so that the CTM output contains a real confidence value for each word?
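To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not NeMo code: the function name `convert_word_dict_seq_to_ctm_with_conf`, the `'confidence'` key in each word dictionary, and the `default_confidence` parameter are all assumptions for illustration. How the per-word confidence would actually be computed and attached to the word dictionaries is the open question.

```python
from typing import Dict, List, Union

# A word entry mixes strings ('word', 'speaker') and floats (timestamps, confidence).
WordDict = Dict[str, Union[str, float]]


def convert_word_dict_seq_to_ctm_with_conf(
    word_dict_seq_list: List[WordDict],
    uniq_id: str = 'null',
    decimals: int = 3,
    default_confidence: float = 0.0,
) -> List[str]:
    """Hypothetical variant that writes a per-word confidence into each CTM line.

    Assumes each word dict MAY carry a 'confidence' key (an assumption, not
    something the existing word dictionaries are known to contain). Words
    without that key fall back to `default_confidence`.
    """
    ctm_lines = []
    for word_dict in word_dict_seq_list:
        spk = word_dict['speaker']
        stt = word_dict['start_time']
        dur = round(word_dict['end_time'] - word_dict['start_time'], decimals)
        word = word_dict['word']
        # Hypothetical key: use the word's own confidence when present.
        confidence = round(float(word_dict.get('confidence', default_confidence)), decimals)
        ctm_lines.append(f"{uniq_id} {spk} {stt} {dur} {word} {confidence}")
    return ctm_lines
```

With something like this, a word dictionary that includes `'confidence': 0.9123` would end its CTM line with `0.912`, while words without a confidence value would still get the fallback.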
Thank you. Kind regards