LIGHT-SERNET

MFCC hop size problem.

Open yihliang831209 opened this issue 2 years ago • 0 comments

"Good job on the paper. However, there seems to be a discrepancy regarding the frame overlaps and hop size between your text and the provided code. In your paper, it's stated that a Hamming window is used to split the audio signal into 64-ms frames with 16-ms overlaps, which are considered as quasi-stationary segments. From this, it would logically follow that the hop size is 48 ms.

However, hyperparameters.py sets "FRAME_STEP = 256". At a sampling rate (fs) of 16 kHz, 256 samples correspond to 256 / 16000 s = 16 ms, so the hop size in the code is 16 ms, not 48 ms. Could you please clarify whether this is a typographical error in the paper, or whether there is a specific reason for the inconsistency?"
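For reference, the arithmetic behind the two hop sizes can be checked with a few lines of Python. This is only a sketch of the calculation described above, not code from the repository: the helper names are made up for illustration, and the 64-ms frame length used in the last step is taken from the paper's text, since the frame length actually used in the code is not quoted in this issue.

```python
# Sketch of the hop-size arithmetic from this issue.
# Helper names are hypothetical; only FRAME_STEP = 256 and fs = 16 kHz
# come from the issue text.

def hop_ms_from_overlap(frame_ms: float, overlap_ms: float) -> float:
    """Hop size implied by a frame length and the overlap between frames."""
    return frame_ms - overlap_ms


def samples_to_ms(n_samples: int, fs_hz: int) -> float:
    """Convert a number of samples to milliseconds at sampling rate fs_hz."""
    return 1000.0 * n_samples / fs_hz


FS_HZ = 16_000    # sampling rate stated in the issue
FRAME_STEP = 256  # value quoted from hyperparameters.py

# Paper: 64-ms frames with 16-ms overlaps.
paper_hop_ms = hop_ms_from_overlap(frame_ms=64.0, overlap_ms=16.0)

# Code: hop of FRAME_STEP samples at 16 kHz.
code_hop_ms = samples_to_ms(FRAME_STEP, FS_HZ)

# Assumption: if the code also uses 64-ms frames (as the paper states),
# a 16-ms hop would correspond to this overlap.
implied_overlap_ms = 64.0 - code_hop_ms

print(f"hop implied by the paper: {paper_hop_ms} ms")
print(f"hop implied by the code:  {code_hop_ms} ms")
print(f"overlap implied by the code (64-ms frames assumed): {implied_overlap_ms} ms")
```

One possible reading, which only the authors can confirm, is that the paper's "16-ms overlaps" actually refers to the hop size, in which case the overlap would be 48 ms and the text and code would agree.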

yihliang831209, Aug 07 '23 07:08