Hui Wang comments

Results: 8 comments of Hui Wang

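A question about video FPS

Thanks for the reply, and great work! Active speaker detection is currently one of the important ways to connect speaker-related audio cues with visual cues. May I merge the training code into [3D-Speaker](https://github.com/alibaba-damo-academy/3D-Speaker)?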

Accuracy of the speaker ASR model (on speaker recognition)

There are several limitations to speaker recognition currently in the pipeline. It may not perform well when the audio duration is too short (less than 60 seconds) or when the...

Before https://github.com/modelscope/3D-Speaker/blob/ab22112d280fed17839094da3874813c9eb63460/speakerlab/models/campplus/layers.py#L109, seg.shape[-1] is greater than or equal to x.shape[-1]. Therefore, that line makes seg and x the same shape.

A beginner's question about this piece of code

@Juelianqvq there is no chance for seg.shape[-1] < x.shape[-1] based on the code https://github.com/modelscope/3D-Speaker/blob/ab22112d280fed17839094da3874813c9eb63460/speakerlab/models/campplus/layers.py#L108
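The shape argument in the two comments above can be checked with a small, dependency-free sketch. It is not the repository's code: `expanded_length` and `seg_avg_expand` are hypothetical helpers, and the sketch assumes that the segment pooling uses ceil-style windows (a trailing partial segment still yields one pooled value) and that each pooled value is repeated `seg_len` times before truncation. Under those assumptions the expanded length is `ceil(T / seg_len) * seg_len`, which can never be smaller than `T`.

```python
import math


def expanded_length(t, seg_len):
    """Length after ceil-style pooling and repeating each pooled value seg_len times."""
    n_segments = math.ceil(t / seg_len)  # a partial last segment still counts
    return n_segments * seg_len


def seg_avg_expand(x, seg_len):
    """Segment-average a 1-D list, expand it back, then truncate to len(x).

    Assumption of this sketch: the last partial segment is averaged over
    the elements that are actually present.
    """
    t = len(x)
    out = []
    for start in range(0, t, seg_len):
        chunk = x[start:start + seg_len]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * seg_len)  # repeat the pooled value seg_len times
    # The expanded sequence is at least as long as the input ...
    assert len(out) >= t
    # ... so truncation is enough to make both the same length.
    return out[:t]


if __name__ == "__main__":
    for t in (1, 99, 100, 101, 250):
        print(t, expanded_length(t, 100))
    print(seg_avg_expand([1, 2, 3, 4, 5], 2))
```

For every `t` in the loop the expanded length is at least `t`, so only the "longer or equal" case arises and a trailing slice is enough to align the shapes.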

Optimize Laplacian computation for large matrices (significant speedup)

Thanks for spotting this bottleneck and proposing an optimization! We’ll verify the speedup and ensure numerical consistency with the original implementation.

What does "lm" in the results mean?

It refers to large-margin fine-tuning, a fairly common technique for improving performance in speaker verification training.

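When training the CAM++ model on the VoxCeleb2 training set, the convergence speed and the model's EER differ greatly from the results reported in the repository, and also from typical speaker recognition results trained on VoxCeleb2. Is this caused by library versions, the training machine, or some other error?

Have you tried the training script at https://github.com/modelscope/3D-Speaker/blob/main/egs/voxceleb/sv-cam%2B%2B/run.sh? Are its results normal?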

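When training the CAM++ model on the VoxCeleb2 training set, the convergence speed and the model's EER differ greatly from the results reported in the repository, and also from typical speaker recognition results trained on VoxCeleb2. Is this caused by library versions, the training machine, or some other error?

Training speed depends on your GPU. If you only have one card, 80 min is normal; you can keep an eye on GPU utilization.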