EfficientAT

Feat: Frame-level Extraction and PyTorch API Updates

Open TioSisai opened this issue 7 months ago • 0 comments

This pull request introduces two main sets of changes: a new feature for frame-level embedding extraction and several updates to ensure compatibility with modern PyTorch versions by replacing deprecated APIs.

New Features:

Frame-level Feature Extraction:

Added a frame: bool parameter to the forward methods in both MobileNet (MN) and DyMN models.
When frame=True, the model preserves the temporal dimension during the final pooling stage, allowing for the extraction of frame-wise embeddings.
This enables more fine-grained temporal analysis, while maintaining backward compatibility with the default clip-level feature extraction.
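The pooling behaviour described above can be illustrated with a small sketch. This is not the code from the pull request: the helper name pool_features, the tensor layout (batch, channels, freq, time), and the example shape are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F


def pool_features(x: torch.Tensor, frame: bool = False) -> torch.Tensor:
    """Pool a (batch, channels, freq, time) feature map (assumed layout).

    frame=False: average over freq and time -> (batch, channels)
    frame=True:  average over freq only     -> (batch, channels, time)
    """
    if frame:
        # Collapse only the frequency axis so every time step is kept.
        return x.mean(dim=2)
    # Clip-level default: global average pooling over both axes.
    return F.adaptive_avg_pool2d(x, 1).flatten(start_dim=1)


# Hypothetical feature map: 2 clips, 960 channels, 4 freq bins, 31 frames.
features = torch.randn(2, 960, 4, 31)
clip_emb = pool_features(features)               # shape (2, 960)
frame_emb = pool_features(features, frame=True)  # shape (2, 960, 31)
```

In this sketch, averaging the frame-level output over time recovers the clip-level embedding, which is why keeping frame=False as the default leaves existing callers unaffected.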

Fixes & Maintenance:

PyTorch API Modernization:

Replaced torchvision's deprecated ConvNormActivation with the current Conv2dNormActivation.
Updated the torch.stft call to use return_complex=True and now compute the power spectrum as torch.square(torch.abs(x)), in line with modern complex-tensor handling.
Replaced torch.cuda.amp.autocast with the more general torch.amp.autocast.
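The STFT and autocast changes might look roughly like the following. This is a hedged sketch rather than the repository's code: the STFT parameters, the random input, the Linear stand-in model, and the "legacy" real-view reduction shown for comparison are all illustrative assumptions.

```python
import torch

# --- Complex STFT followed by a power spectrum ---------------------------
# Illustrative parameters (assumed, not taken from the pull request).
n_fft, hop, win = 1024, 320, 800
wave = torch.randn(1, 32000)  # one second of fake audio at 32 kHz
window = torch.hann_window(win, periodic=False)

spec = torch.stft(
    wave,
    n_fft,
    hop_length=hop,
    win_length=win,
    window=window,
    center=True,
    return_complex=True,  # returns a complex tensor of shape (batch, freq, frames)
)
power = torch.square(torch.abs(spec))  # |X|^2, a real-valued tensor

# For comparison: summing squared real and imaginary parts, as one would
# with a real-valued (..., 2) STFT output, gives the same result.
legacy = torch.view_as_real(spec).pow(2).sum(dim=-1)

# --- Device-agnostic autocast --------------------------------------------
device_type = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(16, 4).to(device_type)  # stand-in for the real model
x = torch.randn(8, 16, device=device_type)

# torch.amp.autocast takes the device type explicitly, so the same code
# path works on both CPU and GPU.
with torch.amp.autocast(device_type=device_type):
    out = model(x)
```

The explicit device_type argument is the main practical difference from torch.cuda.amp.autocast, which only applies to CUDA.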

Jul 15 '25 14:07 TioSisai