not enough space in context's memory pool
read_wav_from_disk: Number of frames read = 1577459.
ggml_new_object: not enough space in the context's memory pool (needed 39200416, available 39200096)
compress: /mnt/c/prog/fork/encodec.cpp/ggml/src/ggml.c:4858: ggml_new_object: Assertion `false' failed.
Aborted
This looks similar to the unresolved bark.cpp error: https://github.com/PABannier/bark.cpp/issues/122
After some experimenting, I believe it is due to the size of the file. If it is above 1 MB, it cannot fit into the buffer. So even if the file is only 10-15 seconds long, it's too big for the buffer.
@bachittle You are right; it stems from the length of the file you are trying to encode. I know where the problem comes from. The LSTM implementation is hacky: the number of nodes in the computational graph scales with the audio length. See: https://github.com/PABannier/encodec.cpp/blob/main/encodec.cpp#L278 For large audio files, the buffer does not have enough memory to hold the computational graph.
We discuss this issue here: https://github.com/ggerganov/ggml/issues/467#issuecomment-1754395527
This is a significant issue, but I never found the time to fix it.
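To illustrate the scaling described above, here is a toy sketch (not the actual encodec.cpp or ggml code) of an unrolled recurrence: every time step appends its own operations to the graph, so the node count grows linearly with the input length while the memory pool stays fixed. The function name and the `ops_per_step` value are hypothetical and chosen only for illustration.

```python
from typing import List


def build_unrolled_graph(n_steps: int, ops_per_step: int = 4) -> List[str]:
    """Toy model of an unrolled recurrent layer.

    Each time step adds `ops_per_step` nodes to the graph. The value 4 is
    an assumption for illustration, not a measured number.
    """
    nodes: List[str] = []
    for t in range(n_steps):
        for k in range(ops_per_step):
            # One graph node per operation per time step.
            nodes.append(f"step{t}_op{k}")
    return nodes


short_graph = build_unrolled_graph(10)
long_graph = build_unrolled_graph(20)

# Doubling the number of time steps doubles the number of graph nodes.
print(len(short_graph), len(long_graph))
```

With a fixed-size pool, any input long enough will eventually exceed it, which matches the assertion failure in the log.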
The EnCodec FAQ also mentions that the library is not designed for long files, as it applies the algorithm to the entire file at once: https://github.com/facebookresearch/encodec/tree/main?tab=readme-ov-file#out-of-memory-errors-with-long-files
Another solution I can think of is to detect when a file would cause a memory issue and split it into two jobs. It might be slower on CPU, but it would allow larger files to be processed.
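A minimal sketch of the splitting idea, generalized to any number of chunks. It only computes the frame ranges; each range would then be encoded as its own job. The helper name `chunk_bounds` and the 480000-frame cap are hypothetical, and the safe limit would have to be determined empirically. Encoding chunks independently may also introduce artifacts at the chunk boundaries, which this sketch does not handle.

```python
from typing import List, Tuple


def chunk_bounds(n_frames: int, max_frames: int) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges covering n_frames,
    each at most max_frames long."""
    if max_frames <= 0:
        raise ValueError("max_frames must be positive")
    return [
        (start, min(start + max_frames, n_frames))
        for start in range(0, n_frames, max_frames)
    ]


# 1577459 is the frame count reported in the log above.
# 480000 is a made-up cap, not a known limit of encodec.cpp.
bounds = chunk_bounds(1577459, 480000)
for start, end in bounds:
    print(f"encode frames [{start}, {end})")
```

Files that already fit under the cap produce a single range, so the same code path can serve both short and long inputs.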
@bachittle Yes! Actually, other great vocoders can do the job, HiFi-GAN for instance.