bark
quantization
Is there a way to quantize Bark? A quantized model would be smaller and faster to load, so it would take less GPU memory and could lower latency. Thanks!
Related to #30 and this
You can find working quantized Bark examples at https://github.com/PABannier/bark.cpp and in OpenVINO.
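For intuition on why quantization shrinks the model, here is a minimal sketch of symmetric per-tensor int8 weight quantization in plain Python. It is not taken from bark.cpp or OpenVINO, which use their own schemes. The function names and the toy weight values are made up for illustration only.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [scale * v for v in q]


# Toy float32 "weights" (hypothetical values, not from Bark).
weights = array("f", [0.5, -1.27, 0.02, 1.0, -0.33])
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

fp32_bytes = len(weights) * weights.itemsize  # 4 bytes per weight
int8_bytes = len(q) * q.itemsize              # 1 byte per weight
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"max round-trip error: {max_err:.6f} (scale = {scale:.6f})")
```

Storing one byte per weight instead of four is where the size and load-time savings come from. The cost is a rounding error of at most half the scale per weight, and whether that is audible in the generated speech has to be checked on the real model.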