encodec.cpp not enough space in context's memory pool

read_wav_from_disk: Number of frames read = 1577459.
ggml_new_object: not enough space in the context's memory pool (needed 39200416, available 39200096)
compress: /mnt/c/prog/fork/encodec.cpp/ggml/src/ggml.c:4858: ggml_new_object: Assertion `false' failed.
Aborted

This looks similar to the unresolved bark.cpp error: https://github.com/PABannier/bark.cpp/issues/122

After some experimenting, I believe the cause is the size of the file. If it is above roughly 1 MB, it cannot fit into the buffer. So even a file that is only 10-15 seconds long is too big for the buffer.

Jan 09 '24 15:01 bachittle

@bachittle You are right; it stems from the length of the file you are trying to encode. I know where the problem comes from. The LSTM implementation is hacky: the number of nodes in the computational graph scales with the audio length. See: https://github.com/PABannier/encodec.cpp/blob/main/encodec.cpp#L278. For large audio files, the buffer does not have enough memory to hold the computational graph.
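To make the scaling argument concrete, here is a toy cost model of an LSTM unrolled over time steps. It is only a sketch: `GraphCostModel` and all of its fields are hypothetical names and are not taken from encodec.cpp or ggml, and the real per-node cost would have to be measured.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical cost model, for illustration only. None of these fields or
// values come from encodec.cpp or ggml.
struct GraphCostModel {
    std::size_t nodes_per_step;  // graph nodes created per LSTM time step
    std::size_t bytes_per_node;  // pool memory consumed by each node
    std::size_t fixed_bytes;     // memory that does not depend on audio length
};

// Pool memory needed when the LSTM is unrolled over n_steps time steps.
// The second term grows linearly with the audio length.
std::size_t unrolled_graph_bytes(const GraphCostModel & m, std::size_t n_steps) {
    return m.fixed_bytes + n_steps * m.nodes_per_step * m.bytes_per_node;
}

// Largest number of time steps whose unrolled graph fits in pool_bytes.
// Returns 0 when even the fixed part does not fit.
std::size_t max_steps_that_fit(const GraphCostModel & m, std::size_t pool_bytes) {
    assert(m.nodes_per_step > 0 && m.bytes_per_node > 0);
    if (pool_bytes < m.fixed_bytes) {
        return 0;
    }
    return (pool_bytes - m.fixed_bytes) / (m.nodes_per_step * m.bytes_per_node);
}
```

With a fixed-size pool, any model of this shape has a hard cap on the number of steps, which would match the behaviour reported above: short clips work and longer ones hit the assertion.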

We discuss this issue here: https://github.com/ggerganov/ggml/issues/467#issuecomment-1754395527

This is a significant issue, but I never found the time to fix it.

Jan 09 '24 17:01 PABannier

In the FAQ for encodec they also mention that this library is not designed for long files as it applies the algorithm to the entire file at once: https://github.com/facebookresearch/encodec/tree/main?tab=readme-ov-file#out-of-memory-errors-with-long-files

Another solution I can think of is detecting when a file would cause a memory issue and splitting it into two jobs. It might be slower on CPU, but it would allow larger files to be processed.
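A rough sketch of the splitting step, generalized from two jobs to as many as needed. `Chunk`, `split_into_chunks`, and `max_chunk_samples` are hypothetical names, not part of encodec.cpp, and the per-chunk encode call is left out because I do not want to guess at the library's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A contiguous range of samples: [offset, offset + length).
struct Chunk {
    std::size_t offset;
    std::size_t length;
};

// Split n_samples into consecutive chunks of at most max_chunk_samples each.
// max_chunk_samples is an assumed, empirically chosen limit below which the
// graph still fits in the memory pool.
std::vector<Chunk> split_into_chunks(std::size_t n_samples,
                                     std::size_t max_chunk_samples) {
    assert(max_chunk_samples > 0);
    std::vector<Chunk> chunks;
    for (std::size_t off = 0; off < n_samples; off += max_chunk_samples) {
        // The last chunk may be shorter than the limit.
        chunks.push_back({off, std::min(max_chunk_samples, n_samples - off)});
    }
    return chunks;
}
```

For the 1577459-sample file from the log and an assumed limit of 500000 samples, this gives four chunks, the last one 77459 samples long. Since each chunk would be encoded independently, there may be audible discontinuities at the boundaries; overlapping the chunks slightly might help, but I have not tested that.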

Jan 11 '24 21:01 bachittle

@bachittle Yes! Actually, other great vocoders can do the job: HiFi-GAN, for instance.

Jan 12 '24 08:01 PABannier