byte count on 16kHz decoding

Open lonce opened this issue 2 years ago • 7 comments

Hi, I am getting an error on decoding when I use "16khz". For my two-second files, the original length is 32000 samples, but the reconstruction comes up 8 samples short:

  File "/home/lonce/working/descript-audio-codec/dac/model/base.py", line 289, in decompress
    recons.audio_data = recons.audio_data.reshape(
RuntimeError: shape '[-1, 1, 32000]' is invalid for input of size 31992

I can "fix" the error by hard-coding the length argument to the reshape operation (line 289 in base.py) to 31992. For a general fix, I suppose the reshape should be given the length of the reconstructed signal, not the original signal. Otherwise, the reconstruction should produce exactly the same number of samples as the original file.
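To illustrate the second option (making the reconstruction match the original length), here is a minimal, framework-free sketch. The helper `fit_length` is hypothetical and is not part of the dac package; it only shows the idea of trimming or zero-padding a decoded signal to a known target length before any fixed-size reshape. With PyTorch tensors the same idea would presumably be expressed with slicing and `torch.nn.functional.pad`.

```python
def fit_length(samples, target_len, pad_value=0.0):
    """Return a copy of `samples` that is exactly `target_len` items long.

    Hypothetical helper for illustration only: longer inputs are trimmed
    at the end, shorter inputs are padded at the end with `pad_value`.
    """
    samples = list(samples)
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [pad_value] * (target_len - len(samples))


# A stand-in for a decoded signal that came back 8 samples short.
recon = [0.1] * 31992
fixed = fit_length(recon, 32000)
print(len(recon), len(fixed))
```

Padding with zeros at the end is only one possible choice; it hides the length mismatch but does not recover the missing samples.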

This is using code pulled from GitHub on 2023.08.17.

p.s. Nice work on the codec - it sounds great, the compression is amazing, and the git docs are easy to understand!

lonce, Aug 19 '23 15:08

[Screenshot: Screen-2023-08-21_15-08-19]

Just a bit more information to illustrate the issue. You can see that the original signal is 32000 samples, but after encoding/decoding, the length is 31992.

Thanks again.

lonce, Aug 21 '23 13:08

I just noticed this only happens with model.encode/decode (and with dac encode / dac decode), not with model.compress/decompress.

lonce, Sep 05 '23 05:09

I have encountered a similar issue when working with a 5-second audio track at 24 kHz. There appears to be a consistent 8-sample loss, which leads me to suspect that the decoder in the model may not be adequately padded.
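As a rough illustration of how insufficient padding can shorten a signal, the sketch below applies the output-length formulas documented for `torch.nn.Conv1d` and `torch.nn.ConvTranspose1d`. The kernel sizes, strides, and paddings are made up to show the mechanism; they are assumptions and not DAC's actual layer configuration.

```python
def conv1d_out_len(length, kernel, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution (formula from the Conv1d docs)."""
    return (length + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv_transpose1d_out_len(length, kernel, stride=1, padding=0,
                             dilation=1, output_padding=0):
    """Output length of a 1-D transposed convolution (ConvTranspose1d docs)."""
    return ((length - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


n = 32000

# Hypothetical stride-1 layer, kernel 7: padding 3 preserves the length,
# padding 2 drops two samples per layer.
same = conv1d_out_len(n, kernel=7, padding=3)
short = conv1d_out_len(n, kernel=7, padding=2)

# Hypothetical stride-2 down/up pair: an odd-length input is floored on the
# way down, so the transposed convolution cannot restore the last sample.
down = conv1d_out_len(n + 1, kernel=4, stride=2, padding=1)
up = conv_transpose1d_out_len(down, kernel=4, stride=2, padding=1)

print(same, short, down, up)
```

Either effect, repeated over several layers, produces a small constant shortfall of the kind reported here, although this sketch does not identify which layer in the model is responsible.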

barneymaydance, Sep 26 '23 18:09

I encountered the same issue: also an 8-sample loss, for 1 second of audio. Has anyone solved this without hard-coding?

Stanwang1210, Nov 14 '23 06:11

Any update? It's stupid that after I compress the audio, I can't decompress it with the same model because of the input length mismatch error. @pseeth @eeishaan

heilrahc, Sep 02 '24 07:09