phasen
An unofficial PyTorch implementation of Microsoft's PHASEN
I wonder whether the author plans to open-source the trained model, which would make it easier for us to use.
It throws the error: operands could not be broadcast together with shapes (514,399) (257,397)
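Hello, and thank you for the reproduction work. I am training the model on my own data, but the loss does not decrease. How should I troubleshoot this? My data consists of clean recordings in both Chinese and English, with MUSAN noise added to form the training set. I use mixloss; the mixloss value stays at about 40 and the sisnr value stays between 7 and 8, and neither one improves.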
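Hello, after splitting the audio into 4-second segments and enhancing each one, there are clicking sounds or dropouts at the points where the segments are joined. Changing the segment length from 4 s to 1 s makes the problem worse. How can this be removed? Is it caused by discontinuities in the audio?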
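One commonly used mitigation for audible seams, offered here only as a sketch and not as this repository's method, is to process overlapping segments and crossfade them where they meet. In the example below, `enhance_in_chunks` and the `enhance` callable are hypothetical names, the 4-second chunk at an assumed 16 kHz sample rate and the 0.5-second overlap are illustrative values, and `enhance` is assumed to return a signal of the same length as its input.

```python
import numpy as np

def enhance_in_chunks(wave, enhance, chunk=64000, overlap=8000):
    """Run `enhance` on overlapping chunks and crossfade the seams.

    `enhance` is a placeholder for the model call (an assumption); it must
    return an array with the same length as the segment it receives.
    Requires chunk >= 2 * overlap.
    """
    wave = np.asarray(wave, dtype=np.float32)
    hop = chunk - overlap
    out = np.zeros(len(wave), dtype=np.float32)
    weight = np.zeros(len(wave), dtype=np.float32)
    # Linear ramps; fade_in + fade_out equals 1 across the overlap region.
    fade_in = np.linspace(0.0, 1.0, overlap, endpoint=False, dtype=np.float32)
    fade_out = 1.0 - fade_in

    start = 0
    while True:
        seg = wave[start:start + chunk]
        est = np.asarray(enhance(seg), dtype=np.float32)
        win = np.ones(len(seg), dtype=np.float32)
        is_last = start + chunk >= len(wave)
        if start > 0:                      # fade in over the previous chunk's tail
            n = min(overlap, len(seg))
            win[:n] = fade_in[:n]
        if not is_last:                    # fade out into the next chunk
            win[-overlap:] = fade_out
        out[start:start + len(seg)] += est * win
        weight[start:start + len(seg)] += win
        if is_last:
            break
        start += hop
    # Normalise by the accumulated window weight (close to 1 everywhere).
    return out / np.maximum(weight, 1e-8)

# Sanity check with an identity "model": the output should match the input.
demo = np.sin(np.linspace(0, 200.0, 200000)).astype(np.float32)
restored = enhance_in_chunks(demo, lambda seg: seg)
```

With a real model, segment edges can still differ because each segment is processed without its neighbouring context, so a longer overlap may be needed.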
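Hi, I am using Mixloss, and the loss becomes NaN as soon as training starts. 1. I have already set the LR very low (0.00001); 2. there is no division by zero. What else could be the cause?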
What loss value should a well-fitted model approach? I am using AISHELL data at SNRs from -5 to 20 dB, and the phase loss is currently somewhat large.
I am trying to reproduce PHASEN, but I have a problem with data preprocessing. When an audio signal is shorter than 4 seconds, what should I do? I...
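A typical way to handle clips shorter than the training length is to pad them (with zeros, or by repeating the clip) and to crop longer ones. The sketch below assumes a 16 kHz sample rate and uses a hypothetical helper name, `fix_length`; neither comes from the repository, so treat it as one possible approach, not the author's preprocessing.

```python
import numpy as np

def fix_length(wave, sample_rate=16000, seconds=4.0, mode="zero"):
    """Return `wave` padded or cropped to exactly `seconds` of audio.

    sample_rate=16000 is an assumption; set it to match your data.
    mode="zero" pads with silence, mode="repeat" loops the clip.
    """
    target = int(sample_rate * seconds)
    wave = np.asarray(wave, dtype=np.float32)
    if len(wave) >= target:
        return wave[:target]                       # crop long clips
    if mode == "repeat" and len(wave) > 0:
        reps = int(np.ceil(target / len(wave)))
        return np.tile(wave, reps)[:target]        # loop the clip to fill
    return np.pad(wave, (0, target - len(wave)))   # zero-pad at the end

short = np.ones(16000, dtype=np.float32)           # a 1-second clip
padded = fix_length(short)                         # 4 s, trailing silence
looped = fix_length(short, mode="repeat")          # 4 s, clip repeated
```

If zero-padding is used, it may be worth excluding the padded region from the loss so that the model is not rewarded for predicting silence.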
I want to run this script, but my computer does not have a GPU. I tried to use the CPU to train, but it failed. How can it be compatible...
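A common pattern for making a PyTorch script run with or without a GPU is to select the device once and move the model and every batch to it. The sketch below uses a small `nn.Linear` as a stand-in for the real network, which is an assumption for illustration; the repository's own training script may need additional changes, for example replacing hard-coded `.cuda()` calls.

```python
import io

import torch
import torch.nn as nn

# Fall back to the CPU when no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model (assumption); replace with the actual network.
model = nn.Linear(257, 257).to(device)

# Create or move every input batch onto the same device as the model.
batch = torch.randn(4, 257, device=device)
output = model(batch)

# When loading a checkpoint saved on a GPU machine, map_location remaps
# its tensors to the selected device. An in-memory buffer is used here
# in place of a checkpoint file.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)
state = torch.load(buffer, map_location=device)
model.load_state_dict(state)
```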
Hi, I use TensorFlow to implement conv_stft like this:
    def init_kernels(win_len, win_inc, fft_len, win_type=None, invers=False):
        if win_type == 'None' or win_type is None:
            window = np.ones(win_len)
        else:
            window = get_window(win_type, win_len, fftbins=True)**0.5...
I got NaN when training with Mix loss (not on a speech denoising task), and fixed it by adding gradient clipping as follows:
    loss.backward()
    nn.utils.clip_grad_norm_(self.estimator.parameters(), 10.0) # add this to clip...
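For context, here is a self-contained sketch of where gradient clipping sits in a training step. The tiny model, the random data, and the MSE loss are placeholders (assumptions), not the repository's estimator or Mix loss; only the `clip_grad_norm_` call with a maximum norm of 10.0 mirrors the snippet above.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 1)                       # placeholder model (assumption)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(16, 8), torch.randn(16, 1) # placeholder batch

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y)    # placeholder loss
loss.backward()

# Rescale gradients in place so their global L2 norm is at most 10.0.
# The return value is the total norm measured before clipping.
total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)

# Global gradient norm after clipping, for inspection.
clipped_norm = torch.norm(
    torch.stack([p.grad.norm() for p in model.parameters()])
)
optimizer.step()
```

Clipping has to happen after `loss.backward()` and before `optimizer.step()`. Logging `total_norm` can help show when gradients start to explode, although clipping alone will not repair a loss that is already NaN in the forward pass.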