Zili Huang comments

Results: 18 comments of Zili Huang

How do I run RPNSD on a single audio file with the pretrained model?

> Hi,
>
> Thanks for making RPNSD available. I was wondering, what if I don't want to run the experiment? Say I have a long audio file, file.wav, can...

why only one epoch for train and 10 for adapt?

Hi, the training set is quite large: the clean training set is over 3000 hours, and over 12000 hours after augmentation. I tried to train more...

parse_options.sh: No such file or directory

Hi. Sorry for my mistake, I have already fixed that. You can simply soft link tools/kaldi/egs/wsj/s5/utils to the main directory. parse_options.sh is one file from utils (this is a file from...
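A minimal sketch of the soft link described above, assuming the repository root contains `tools/kaldi/egs/wsj/s5/utils` and that the link should be named `utils`. The helper name `link_kaldi_utils` and its parameters are hypothetical and are not part of RPNSD or Kaldi.

```python
import os


def link_kaldi_utils(repo_root, kaldi_utils="tools/kaldi/egs/wsj/s5/utils", link_name="utils"):
    """Create repo_root/link_name as a symlink to the Kaldi utils directory.

    Hypothetical helper: the default paths follow the comment above, and the
    link name "utils" is an assumption.
    """
    target = os.path.abspath(os.path.join(repo_root, kaldi_utils))
    link_path = os.path.join(repo_root, link_name)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"Kaldi utils directory not found: {target}")
    # Leave any existing file, directory, or link untouched.
    if not os.path.lexists(link_path):
        os.symlink(target, link_path)
    return link_path
```

After the link exists, `utils/parse_options.sh` resolves to the file inside the Kaldi checkout, which is what the scripts look for.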

parse_options.sh: No such file or directory

First, make sure that your data is 8kHz telephone data. If it is not, there might be some mismatch (since I am training on 8kHz telephone data). Of course, our...
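One way to check the sample rate before running anything is sketched below, assuming the input is an uncompressed PCM WAV file that Python's standard `wave` module can read. The function name `sample_rate_matches` is hypothetical, and this only checks the header rate, not whether the audio is telephone speech.

```python
import wave


def sample_rate_matches(wav_path, expected_hz=8000):
    """Return (matches, actual_hz) for a PCM WAV file.

    Hypothetical helper: 8000 Hz is the rate mentioned in the comment above.
    """
    with wave.open(wav_path, "rb") as wav_file:
        actual_hz = wav_file.getframerate()
    return actual_hz == expected_hz, actual_hz
```

If the rate differs, the file would need to be resampled to 8kHz with an external tool before use.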

log_softmax Error

Hi, could you tell me where this error occurs? Could you also tell me your PyTorch version? This project was written with PyTorch 0.4, which is quite old. There might...

> I am getting the same error at
> File "/home/RPNSD/scripts/model/faster_rcnn/faster_rcnn.py", line 125, in forward
> RCNN_loss_cls_spk = F.cross_entropy(cls_score_nonzero, rois_label_nonzero)
>
> some inspection shows that it is loading some...

some comments

Hi Shinji,

- [x] I think I already mentioned that my project is largely based on https://github.com/jwyang/faster-rcnn.pytorch in the first line of `README.md`. Do you mean a more formal acknowledgement?...

(WIP) a better version of enhancement and separation downstream

![enh](https://user-images.githubusercontent.com/35029997/187686370-97a8cfd5-21db-4033-bc67-d2f888e90bbf.png)
![sep](https://user-images.githubusercontent.com/35029997/187686425-7e5e3a01-51d7-430d-95ff-e779fc2b8aaf.png)

These two figures compare the previous and new configs. The top figure shows the PESQ value for the enhancement task, and the bottom figure shows the SI-SDRi value...

(WIP) a better version of enhancement and separation downstream

Detailed changes for enhancement:

- reduce total training steps from 150000 to 100000 (the model generally converges within 100000 iterations)
- adjust the learning rate from 1e-4 to 1e-3 (compare different...

(WIP) a better version of enhancement and separation downstream

Detailed changes for separation:

- adjust the learning rate from 1e-4 to 1e-3 (after comparing different learning rates, 1e-3 is better, with ~0.5 dB improvement)
- **adjust n_fft and window_size of STFT...