Masking in encoder
Hi, on datasets whose signals vary widely in length or are very short, masking in the encoder gave good results in my experiments. So I suggest adding the feature length to the dataset structure so that an encoder mask can be generated. For example, if a 16000-sample signal is converted to a mel-spectrogram with a 512-point FFT and a hop size of 256, the feature length is about 62 (= 16000 // 256). This value would be stored in the dataset structure as feat_length; it is the unpadded length, which differs from the padded feature length. The encoder mask could then be computed from feat_length in "conformer.py". What do you think?
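To make the proposal concrete, here is a minimal sketch of how a padding mask could be derived from feat_length. It is only an illustration: the helper names feature_length and lengths_to_mask are hypothetical and not part of this repository, NumPy stands in for whatever tensor library the encoder uses, and the frame count follows the approximate 16000 // 256 rule from above (the exact value depends on how the spectrogram pads and centers frames).

```python
import numpy as np


def feature_length(num_samples: int, hop_length: int = 256) -> int:
    """Approximate number of frames, using the num_samples // hop_length rule.

    Hypothetical helper: the exact count depends on STFT padding/centering.
    """
    return num_samples // hop_length


def lengths_to_mask(feat_lengths, max_len=None):
    """Build a boolean mask of shape (batch, max_len).

    True marks valid (unpadded) frames, False marks padding.
    Hypothetical helper, not the repository's own implementation.
    """
    feat_lengths = np.asarray(feat_lengths)
    if max_len is None:
        max_len = int(feat_lengths.max())
    positions = np.arange(max_len)  # frame indices 0 .. max_len - 1
    # Broadcast (1, max_len) against (batch, 1): a frame is valid
    # when its index is smaller than that item's feature length.
    return positions[None, :] < feat_lengths[:, None]


# Three signals of different lengths, padded to the longest in the batch.
signal_lengths = [16000, 8000, 4000]
feat_lengths = [feature_length(n) for n in signal_lengths]
mask = lengths_to_mask(feat_lengths)
```

With a mask like this, attention scores at padded positions can be set to a large negative value before the softmax, so short signals are not influenced by padding frames.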
Hello, I have received your email. I will reply to you as soon as possible. Best wishes!
Hi @mystlee
Masking has been implemented in the new version; please check out compute_mask in feature_extraction.py.
Closing this for now. Feel free to reopen if you have further questions.