Zili Huang

18 comments by Zili Huang

> Hi,
>
> Thanks for making RPNSD available. I was wondering, what if I don't want to run the experiment? Say I have a long audio file, file.wav, can...

Hi, the training set is quite large. The clean training set is more than 3000 hours, and more than 12000 hours after augmentation. I tried to train more...

Hi. Sorry for my mistake; I have already fixed that. You can simply soft link `tools/kaldi/egs/wsj/s5/utils` to the main directory. `parse_options.sh` is one file from `utils` (this is a file from...
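The soft link can also be created from Python. This is a minimal sketch, assuming it is run against the repository root and that the link should be named `utils`. The helper name `link_kaldi_utils` is hypothetical and not part of RPNSD. The usual shell equivalent, run from the repository root, would be `ln -s tools/kaldi/egs/wsj/s5/utils utils`.

```python
import os


def link_kaldi_utils(repo_root: str) -> str:
    """Create <repo_root>/utils as a symlink to the Kaldi WSJ utils directory.

    Hypothetical helper for illustration; returns the path of the link.
    """
    target = os.path.join(repo_root, "tools", "kaldi", "egs", "wsj", "s5", "utils")
    link = os.path.join(repo_root, "utils")

    if not os.path.isdir(target):
        raise FileNotFoundError(f"Kaldi utils directory not found: {target}")

    # Replace a stale symlink, but never touch a real directory or file.
    if os.path.islink(link):
        os.remove(link)

    os.symlink(os.path.abspath(target), link)
    return link
```

After the link exists, scripts that source `utils/parse_options.sh` should be able to find it from the main directory.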

First, make sure that your data is 8kHz telephone data. If it is not, there might be some mismatch (since I am training on 8kHz telephone data). Of course, our...
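One quick way to check the sampling rate before running the model is to read the WAV header. The sketch below uses only Python's standard `wave` module, which handles uncompressed PCM WAV files; the function names are hypothetical. It checks the rate only and cannot tell whether the audio is actually telephone speech.

```python
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def is_8khz(path: str) -> bool:
    """True if the file is sampled at 8000 Hz, the rate the model was trained on."""
    return wav_sample_rate(path) == 8000
```

If `is_8khz("file.wav")` returns `False`, the audio would need to be resampled to 8kHz first to reduce the mismatch.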

Hi, could you tell me where this error occurs? Also, which PyTorch version are you using? This project was written with PyTorch 0.4, which is quite old. There might...
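To report the version, or to check whether an environment matches the 0.4 series, the version string can be compared as below. This is a small sketch with hypothetical helper names; it only parses a version string and does not import PyTorch itself.

```python
import re


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '0.4.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_pytorch_04(version: str) -> bool:
    """True if the version string belongs to the 0.4 series."""
    return parse_major_minor(version) == (0, 4)
```

With PyTorch installed, `is_pytorch_04(torch.__version__)` indicates whether the environment matches the version the project was written against.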

> I am getting the same error at
> File "/home/RPNSD/scripts/model/faster_rcnn/faster_rcnn.py", line 125, in forward
> RCNN_loss_cls_spk = F.cross_entropy(cls_score_nonzero, rois_label_nonzero)
>
> some inspection shows that it is loading some...

Hi Shinji,
- [x] I think I already mentioned that my project is largely based on https://github.com/jwyang/faster-rcnn.pytorch in the first line of `README.md`. Do you mean a more formal acknowledgement?...

![enh](https://user-images.githubusercontent.com/35029997/187686370-97a8cfd5-21db-4033-bc67-d2f888e90bbf.png) ![sep](https://user-images.githubusercontent.com/35029997/187686425-7e5e3a01-51d7-430d-95ff-e779fc2b8aaf.png) These two figures compare the previous and new configs. The top figure shows the PESQ value for the enhancement task, and the bottom figure shows the SI-SDRi value...

Detailed changes for enhancement:
- reduce the total training steps from 150000 to 100000 (the model generally converges within 100000 iterations)
- adjust the learning rate from 1e-4 to 1e-3 (compare different...
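The two enhancement changes listed above can be summarized as a config override. This is an illustrative sketch only: the key names `max_steps` and `lr` are hypothetical and are not taken from the actual config file, and only the values stated above are used.

```python
# Previous enhancement settings (key names are hypothetical placeholders).
previous = {"max_steps": 150000, "lr": 1e-4}

# New settings: fewer training steps and a larger learning rate.
new = {**previous, "max_steps": 100000, "lr": 1e-3}

# Collect each changed key as (old value, new value) for a quick review.
changed = {
    key: (previous[key], new[key])
    for key in previous
    if previous[key] != new[key]
}
```

Here `changed` maps each modified key to its old and new value, which makes the difference between the two configs easy to review.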

Detailed changes for separation:
- adjust the learning rate from 1e-4 to 1e-3 (after comparing different learning rates, 1e-3 is better, with a ~0.5 dB improvement)
- **adjust n_fft and window_size of STFT...