opensmile-python

Extraction of fixed windows for LLD

Open giorgiolbt opened this issue 4 years ago • 2 comments

Hi,

I was wondering whether it is possible to get a fixed total number of windows even when extracting LLDs. At the moment, I obtain a variable number of feature vectors for each audio file; the number depends on the file's length because the window size is fixed. I would need, for instance, 200 rows for each audio file, regardless of its duration.

Thanks in advance! Giorgio

giorgiolbt • Apr 28 '21 10:04

No, that is not possible.

frankenjoe • Apr 28 '21 11:04

Use zero padding. That is a common step in speech processing. With Keras, it takes only one line to make all utterances have the same number of rows.

Reference:
https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/sequence/pad_sequences

bagustris • Jul 21 '21 06:07
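
For readers who want to see the zero-padding idea without a Keras dependency, here is a minimal NumPy sketch. It is an illustration, not part of opensmile or of the answers above: the helper name `pad_or_truncate`, the target of 200 rows (taken from the question), and the random arrays standing in for LLD matrices are all assumptions. In practice the input would presumably be the LLD DataFrame converted with `.to_numpy()`.

```python
import numpy as np


def pad_or_truncate(features, target_rows=200, pad_value=0.0):
    """Return a (target_rows, n_features) array.

    Hypothetical helper: clips longer than target_rows are cut at the end,
    shorter ones are filled with pad_value after the last real frame.
    """
    features = np.asarray(features, dtype=np.float32)
    n_frames, n_features = features.shape
    if n_frames >= target_rows:
        # Keep only the first target_rows frames.
        return features[:target_rows].copy()
    padded = np.full((target_rows, n_features), pad_value, dtype=np.float32)
    padded[:n_frames] = features
    return padded


# Stand-ins for per-file LLD matrices (frames x features); 25 columns is arbitrary.
rng = np.random.default_rng(0)
short_clip = rng.normal(size=(120, 25))
long_clip = rng.normal(size=(350, 25))

batch = np.stack([pad_or_truncate(short_clip), pad_or_truncate(long_clip)])
print(batch.shape)  # (2, 200, 25)
```

Note that this pads and truncates at the end of each clip. The Keras `pad_sequences` function linked above defaults to `padding='pre'`, `truncating='pre'`, and `dtype='int32'`, so for float features one would likely pass `dtype='float32'` and, to match this sketch, `padding='post'` and `truncating='post'`.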