Wangyou Zhang
@AntoineBlanot Could you paste the content of `run.sh` and the model config file (.yaml) you used?
Thank you! I think the error is caused by the default value of the argument `load_all_speakers` (=false) in [TSEPreprocessor](https://github.com/espnet/espnet/blob/master/espnet2/train/preprocessor.py#L1657). So it will only prepare one reference signal (corresponding to...
Thanks for your suggestion! I also considered adding SpEx series in the model pool. The major concern was indeed the multi-task training in SpEx models. It is not easy to...
Yes, I have two versions of TD-SpeakerBeam, ① one basically following the recipe's default configuration (only one modification in enroll_segment, set to 32000), and ② another using speaker embeddings instead...
> Exploring Time-Frequency Domain Target Speaker Extraction for Causal and Non-Causal Processing
> Is this the name of your new article?

Yes, it is the paper that contains the results I...
I guess the data might have some mismatch from my setup. Could you share more details about your data? For example, the sampling rates of the audio files in `wav.scp`, `spk1.scp`, `enroll_spk1.scp`...
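One quick way to collect the sampling rates is sketched below. It is only a rough helper, not part of the recipe: the function names `read_scp` and `sampling_rates` are made up for illustration, and it assumes each scp line has the form `<utt_id> <path>` where the path points to a PCM WAV file. Entries in any other form (pipes, non-WAV audio, missing files) are only counted as skipped.

```python
import wave
from collections import Counter
from pathlib import Path


def read_scp(scp_path):
    """Yield (utt_id, value) pairs from a Kaldi-style scp file.

    Assumption: each non-empty line is "<utt_id> <path>".
    """
    with open(scp_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            parts = line.split(maxsplit=1)
            if len(parts) == 2:
                yield parts[0], parts[1]


def sampling_rates(scp_path):
    """Count the sampling rates of the WAV files listed in an scp file.

    Entries that are not readable PCM WAV files are counted under
    the key "skipped" instead of raising an error.
    """
    counts = Counter()
    for _, path in read_scp(scp_path):
        if not Path(path).is_file():
            counts["skipped"] += 1
            continue
        try:
            with wave.open(path, "rb") as w:
                counts[w.getframerate()] += 1
        except (wave.Error, EOFError):
            counts["skipped"] += 1
    return counts
```

Calling `sampling_rates(...)` on each of the scp files and comparing the resulting counters should show whether the mixture, reference, and enrollment audio share the same sampling rate.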
Can you also share the information about your python environment?

**Basic environments:**
- OS information: [e.g., Linux 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64]
- python version: [e.g. 3.7.3 (default,...
Another thing you may try is running stage 8 in `run.sh` to obtain the metrics for the mixture data (`dump/raw/RESULTS.md`) so that we can check if they match the numbers...
> > Another thing you may try is running stage 8 in `run.sh` to obtain the metrics for the mixture data (`dump/raw/RESULTS.md`) so that we can check if they match...
I will take a closer look at the recipe and your setup, and I will try to see if I can reproduce your problem.