
Masking in encoder

Open mystlee opened this issue 2 years ago • 1 comments

Hi, on datasets composed of signals with varying or very short lengths, masking in the encoder gave good results in my experiments. So I suggest adding the feature length to the dataset structure so that an encoder mask can be generated. For example, if a 16000-sample signal is transformed to a mel-spectrogram with a 512-point FFT and a hop size of 256, the feature length would be about 62 (=16000//256). This feature length would be recorded in the dataset structure as feat_length; it is different from the padded feature length. Using this feature length, the encoder mask can then be calculated in "conformer.py". How about this?
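To make the proposal concrete, here is a minimal, framework-agnostic sketch of the idea in plain Python. The helper names `feature_length` and `padding_mask` are hypothetical and are not part of TensorFlowASR; the frame count uses the floor-division estimate from the post above, and a real feature extractor may produce a slightly different count depending on its padding and framing settings.

```python
def feature_length(num_samples: int, hop_size: int = 256) -> int:
    """Estimate the number of frames as num_samples // hop_size (assumption from the post)."""
    return num_samples // hop_size


def padding_mask(feat_lengths: list[int], max_length: int) -> list[list[bool]]:
    """Build a [batch, max_length] mask: True for real frames, False for padded frames."""
    return [[t < n for t in range(max_length)] for n in feat_lengths]


# Three utterances of different lengths, padded to the longest one in the batch.
signal_lengths = [16000, 8000, 4000]
feat_lengths = [feature_length(n) for n in signal_lengths]
mask = padding_mask(feat_lengths, max_length=max(feat_lengths))
```

In an encoder, a mask like this would typically be used to keep attention from attending to the padded frames. If the encoder subsamples the time axis, the lengths would need to be reduced by the same factor before the mask is built.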

mystlee avatar Apr 18 '23 05:04 mystlee

Hello, I have received your email. I will reply to you as soon as possible. Best wishes!

Aegon007 avatar Apr 18 '23 05:04 Aegon007

Hi @mystlee

Masking has been implemented in the new version; please check out compute_mask in feature_extraction.py.

Closing this for now. Feel free to reopen if you have further questions.

nglehuy avatar May 04 '24 17:05 nglehuy