whispering: replicating DeepFilterNet in streaming mode

Hey,

Great work building this and putting all the pieces together. One thing that caught my attention was the use of DeepFilterNet for noise suppression.

Based on the code here: https://github.com/Sharrnah/whispering/blob/edea186b0b72d6e538944d758b9e6a82bf6e3974/Models/STS/DeepFilterNet.py

Can you please confirm whether it works as expected? I see what look like flaws in the variable conversions, but I'm not sure. I'm looking to implement DeepFilterNet (DF) in one of my own projects.

Any help will be hugely appreciated.

Thanks in advance

Jan 30 '25 21:01 neeagl

Hi.

I have not (yet) gotten DeepFilterNet to work in streaming mode. The processing time was too long, so it could not keep up with the recording (at least that's what I think the issue is, because the resulting audio was unusable). I think it's even mentioned somewhere that DeepFilterNet is not really intended for streaming mode.
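One way to check whether processing can keep up is to measure the real-time factor: the time spent processing a chunk divided by the duration of that chunk. Below is a minimal sketch. `real_time_factor` and `process` are made-up names, and `process` stands in for whatever denoiser call is being tested; none of this comes from the DeepFilterNet code.

```python
import time


def real_time_factor(process, chunk, sample_rate, repeats=5):
    """Return processing time divided by audio duration for one chunk.

    A value below 1.0 means processing is faster than the audio arrives.
    `process` is a placeholder for the denoiser call under test.
    """
    audio_seconds = len(chunk) / sample_rate
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        process(chunk)
        timings.append(time.perf_counter() - start)
    # Use the slowest run: a single slow chunk is enough to cause a dropout.
    return max(timings) / audio_seconds


sample_rate = 48000
chunk = [0.0] * (sample_rate // 2)  # 0.5 s of silence as dummy input
rtf = real_time_factor(lambda samples: samples, chunk, sample_rate)
print(f"real-time factor: {rtf:.4f}")
```

If the value is at or above 1.0 for the chunk size used while recording, the audio will fall behind no matter how the chunks are stitched together.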

It might be possible to apply DeepFilterNet with a moving-window approach while recording, but I haven't tried that yet, and as far as I could find, other DeepFilterNet users had no luck either.

So unfortunately it's currently limited to processing the audio once recording has finished.

Jan 30 '25 21:01 Sharrnah

@Sharrnah thanks for your quick reply. Could you suggest the best streaming library for noise suppression? Any thoughts on RNNoise or noisereduce? I'm open to any other suggestions.

Jan 30 '25 21:01 neeagl

Other than DeepFilterNet, I have only played around with the noisereduce algorithm. I had similar issues as with DeepFilterNet, though not as bad.

It's maybe not a library, but I recommend just using NVIDIA Broadcast if you want streamed noise cancellation.

Sorry that I can't be of more help.

Jan 30 '25 21:01 Sharrnah

Thanks again. I came across NVIDIA Broadcast as well, but I need something deployable in Python or as a standalone real-time service.

Thanks again for your quick response.

Jan 30 '25 21:01 neeagl

Maybe have a look at https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/commit/a669fee786f6b9190cff2f1d809057673f257e8d

It uses the noisereduce algorithm, but the code is written in a very complicated way, so it will probably be difficult to extract the usage.

But I think it uses a sliding window with crossfade.
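A rough sketch of the sliding-window-with-crossfade idea might look like the following. It is not taken from the RVC code. `denoise_stream`, `block`, and `overlap` are made-up names, and `denoise` is a placeholder for any function that maps a block of samples to a block of the same length (for example a wrapper around noisereduce or DeepFilterNet). Overlapping blocks are denoised independently, and the overlapping region is blended with a linear fade so the block boundaries do not click.

```python
import numpy as np


def denoise_stream(chunks, denoise, block=4800, overlap=480):
    """Yield denoised audio from an iterable of 1-D float chunks.

    `denoise` is a placeholder: any callable that takes a block of
    samples and returns a block of the same length.
    """
    assert 0 < overlap < block
    hop = block - overlap
    # Complementary linear fades: fade_in + fade_out == 1 at every sample.
    fade_in = np.linspace(0.0, 1.0, overlap, endpoint=False, dtype=np.float32)
    fade_out = 1.0 - fade_in
    buffer = np.zeros(0, dtype=np.float32)
    tail = None  # faded-out end of the previous denoised block

    for chunk in chunks:
        buffer = np.concatenate([buffer, np.asarray(chunk, dtype=np.float32)])
        while len(buffer) >= block:
            out = np.asarray(denoise(buffer[:block]), dtype=np.float32).copy()
            if tail is not None:
                # Blend the start of this block with the end of the last one.
                out[:overlap] = out[:overlap] * fade_in + tail
            tail = out[hop:] * fade_out
            yield out[:hop]
            # Keep the last `overlap` samples so the next block overlaps.
            buffer = buffer[hop:]


# Usage with a dummy denoiser that only halves the volume.
audio = np.random.default_rng(0).standard_normal(48000).astype(np.float32)
incoming = (audio[i:i + 1000] for i in range(0, len(audio), 1000))
cleaned = np.concatenate(list(denoise_stream(incoming, lambda b: b * 0.5)))
print(len(cleaned), "samples written")
```

This sketch adds roughly one block of latency and does not flush the final partial block. Whether the result is usable still depends on the denoiser finishing each block in less time than the hop duration.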

If you manage to write a simple class for this to use, let me know. :)

Jan 30 '25 21:01 Sharrnah