Can we run the NeMo MSDD Neural Diarizer model in real time for real-time diarization?
Disclaimer: I just want some input on developing the pipeline described below, and on whether it is feasible.
Requirements: I'm currently working on a project where I need diarization to work in real time. I tried the PyAnnote library, but it could not match the accuracy of MSDD. However, we weren't able to run MSDD in real time. Is there an example of how to do this? We need to run MSDD along with a Whisper model for transcription.
Check these two PRs: https://github.com/NVIDIA/NeMo/pull/5609 and https://github.com/NVIDIA/NeMo/pull/7896, although I couldn't get them to work.
Thanks for your interest. No, we currently don't support online speaker diarization using MSDD.
@nithinraok Thank you, Nithin, for the confirmation.