AudioSet temporally-strong labels
First, I would like to say that this repo is great: so many models, all in PyTorch, and getting them to work on my machine was very easy.
Have you tried fine-tuning the models on the temporally-strong labeled subset of the AudioSet dataset?
Hi, thanks for your interest and kind words!
Indeed, we are currently working on this. I'm experimenting with a frame-wise version of DyMN and fine-tuning it on AudioSet Strong. I will update this repository with new models as soon as the experiments are conclusive.
To follow up on this: we have fine-tuned a couple of transformer models on the temporally-strong labels of AudioSet:
https://github.com/fschmid56/PretrainedSED
The models are not yet low-complexity, but we will also publish something in that regard soonish :-)
Looking forward to your less complex models :-)