nateanl
When applying convolution on `waveforms` and `rir_waveform`, the `rir_waveform` is flipped in numpy or scipy implementations. But in speechbrain's implementation, the `rir_waveform` is directly used when [the solution is not...
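To make the "flipped" distinction concrete, here is a minimal pure-Python sketch. It is not the numpy, scipy, or speechbrain code, and the helper names `convolve_full` and `correlate_full` are hypothetical. It shows that true convolution is the same as sliding the time-reversed kernel over the signal, so using the kernel unflipped gives a different result for an asymmetric `rir_waveform`.

```python
def convolve_full(signal, kernel):
    """Full linear convolution: the kernel is (implicitly) flipped."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def correlate_full(signal, kernel):
    """Full cross-correlation: the kernel slides over the signal unflipped."""
    n_kernel = len(kernel)
    out = [0.0] * (len(signal) + n_kernel - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i - j + n_kernel - 1] += s * k
    return out


# Toy values chosen for illustration only.
waveform = [1.0, 2.0, 3.0]
rir_waveform = [1.0, 0.5, 0.25]  # asymmetric, so flipping matters

convolved = convolve_full(waveform, rir_waveform)
unflipped = correlate_full(waveform, rir_waveform)
flipped = correlate_full(waveform, rir_waveform[::-1])

print(convolved)   # convolution result
print(unflipped)   # differs from the convolution
print(flipped)     # matches the convolution
```

If an implementation applies a correlation-style operator to the RIR without reversing it first, its output corresponds to `unflipped` rather than `convolved`.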
# TorchAudio Beamforming Module Migration ## Overview `torchaudio` supports an integrated [`MVDR`](https://pytorch.org/audio/stable/transforms.html#mvdr) module under `torchaudio.transforms`. To use it, users need to provide `ref_channel` and `solution` (options: [`ref_channel`, `evd`, `power`]) when...
### 🚀 The feature [Hidden-Unit BERT (HuBERT)](https://arxiv.org/pdf/2106.07447.pdf?fbclid=IwAR3hI4uGqc4mV5j-ob8R5yLu-BaamVoe9ncxUoVmgFLjJXsE1IevP0rdNYY), a self-supervised model for speech representations, was proposed and is widely used in downstream tasks such as speech recognition, speaker diarization, speaker identification, etc....
### 🚀 The feature In some research cases, Wav2Vec2 or HuBERT is expected to be frozen (i.e. ``requires_grad=False`` is set for all parameters). - Users use it as a feature...
### 🚀 The feature torchaudio recently added a mask-based MVDR beamforming module, which takes the multi-channel noisy STFT and the estimated time-frequency masks as input and generates the single-channel enhanced...
### 🚀 The feature In ``torchaudio.transforms.MVDR`` the trace of the multi-dimensional tensor is computed via a ``_get_mat_trace`` method due to the lack of PyTorch support. There is an [ongoing PR](https://github.com/pytorch/pytorch/pull/62714)...
### 🐛 Describe the bug When using the two loading methods on the same audio file, the lengths of the waveform tensors are different. I can reproduce this issue with...
### 🚀 The feature To speed up the `InverseMelScale` module, the SGD optimization can be replaced with `torch.linalg.lstsq`. ### Motivation, pitch The current `InverseMelScale` module applies an SGD optimizer...
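The least-squares idea can be sketched with NumPy's `np.linalg.lstsq` as a stand-in for `torch.linalg.lstsq`. This is an illustration under assumptions, not the torchaudio implementation: the function name `inverse_mel_lstsq`, the `(n_freq, n_mels)` filterbank layout, and the random stand-in filterbank are all hypothetical.

```python
import numpy as np


def inverse_mel_lstsq(mel, fb):
    """Estimate a linear-frequency spectrogram from a mel spectrogram.

    mel: array of shape (n_mels, time)
    fb:  filterbank of shape (n_freq, n_mels), assumed layout
    Returns an array of shape (n_freq, time).
    """
    # Solve fb.T @ spec = mel for all frames in one call, with no
    # iterative optimizer. When n_mels < n_freq the system is
    # underdetermined and lstsq returns the minimum-norm solution.
    spec, _, _, _ = np.linalg.lstsq(fb.T, mel, rcond=None)
    return spec


rng = np.random.default_rng(0)
n_freq, n_mels, n_frames = 8, 4, 5
fb = rng.random((n_freq, n_mels))       # random stand-in, not a real mel filterbank
spec = rng.random((n_freq, n_frames))   # toy "ground truth" spectrogram
mel = fb.T @ spec

estimate = inverse_mel_lstsq(mel, fb)
# The estimate reproduces the mel spectrogram when projected back.
reprojection_error = np.abs(fb.T @ estimate - mel).max()
print(estimate.shape, reprojection_error)
```

Because the problem is underdetermined, the estimate generally differs from the original spectrogram and may contain negative values, so a practical version would likely clamp the result to be non-negative.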