
changing <partials_n_frames> to reduce partial utterances length and increase resolution (diarization with spectral clustering)

Open dcanones opened this issue 4 years ago • 3 comments

Hi!

I am trying to implement the paper: https://arxiv.org/pdf/1710.10468.pdf to create an unsupervised diarization algorithm using the d-vectors provided by the pre-trained model in Resemblyzer.

I found that the length of the partial utterances (1.6 s), determined by the hyperparameter partials_n_frames (default value 160), may be too long. In the paper, the authors recommend a window size of 240 ms and a step of 120 ms for this kind of diarization.
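To make the arithmetic above concrete, here is a minimal sketch of the frame/duration conversion. It assumes a 10 ms hop between mel frames (which is what makes 160 frames equal 1.6 s); the helper names are hypothetical and not part of Resemblyzer.

```python
# Assumption: consecutive mel frames are 10 ms apart, so 160 frames span 1.6 s.
MEL_WINDOW_STEP_MS = 10


def frames_to_ms(n_frames, step_ms=MEL_WINDOW_STEP_MS):
    """Duration in milliseconds covered by n_frames mel frames."""
    return n_frames * step_ms


def ms_to_frames(duration_ms, step_ms=MEL_WINDOW_STEP_MS):
    """Number of mel frames needed to cover duration_ms exactly."""
    if duration_ms % step_ms != 0:
        raise ValueError("duration must be a multiple of the frame step")
    return duration_ms // step_ms


print(frames_to_ms(160))  # default partial length in ms
print(ms_to_frames(240))  # frames for the paper's 240 ms window
print(ms_to_frames(120))  # frames for the paper's 120 ms step
```

Under that assumption, the paper's 240 ms window would correspond to 24 frames, far fewer than the default 160.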

Can this parameter be changed easily? Since it is defined as a setting in the source code (hyperparams.py) rather than exposed as an argument of a function or method, modifying it does not look like a good idea.

Thanks in advance.

David.

dcanones avatar Mar 02 '21 21:03 dcanones

Did you change the mel window length to 240 and the mel window step to 120?

sourav1122 avatar Mar 23 '21 09:03 sourav1122

I have the same question, as I had resolution issues while implementing the same paper. I'm a little confused about why partials_n_frames isn't a changeable parameter. Have you tried changing it?

hbq-ruc avatar Jul 23 '21 01:07 hbq-ruc

Who said partials_n_frames cannot be changed? If I want a partial utterance duration of 400 ms (the default is 1.6 s), I set the rate to 2.5 and change partials_n_frames to 40, so that mel_window_step * partials_n_frames == partial utterance duration.
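A rough sketch of that calculation, assuming mel_window_step is 10 ms and that the rate means the number of partial utterances per second (so a rate of 1000 / duration_ms yields back-to-back partials with no overlap). The helper name and the overlap parameter are hypothetical, added only for illustration.

```python
# Assumption: mel_window_step is 10 ms, so frames = duration_ms / 10.
MEL_WINDOW_STEP_MS = 10


def partial_settings(duration_ms, overlap=0.0, step_ms=MEL_WINDOW_STEP_MS):
    """Return (partials_n_frames, rate) for a desired partial duration.

    overlap is the fraction shared by consecutive partials (0.0 = none).
    rate is interpreted as partial utterances per second.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if duration_ms % step_ms != 0:
        raise ValueError("duration must be a multiple of the frame step")
    partials_n_frames = duration_ms // step_ms
    hop_ms = duration_ms * (1.0 - overlap)  # time between partial starts
    rate = 1000.0 / hop_ms
    return partials_n_frames, rate


n_frames, rate = partial_settings(400)
print(n_frames, rate)  # settings for 400 ms partials without overlap

# Sanity check of the relation quoted above.
assert MEL_WINDOW_STEP_MS * n_frames == 400
```

A lower rate than the no-overlap value would leave gaps between partials, so 2.5 is the smallest sensible rate for 400 ms partials; raising it adds overlap. Whether the pretrained model still produces good embeddings on such short partials is a separate question that this sketch does not address.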

kafan1986 avatar May 07 '22 04:05 kafan1986