Technotech

6 comments by Technotech

> Going to 3 bits alone is a big drop in quality

If a 13B model at 3-bit could fit into VRAM, it'd probably still be better than a 7B...

> @TechnotechYT 13B at 3-bit will probably perform better than 7B at 4-bit, yes. I think it would still be a tight squeeze in 8 GB, especially if you have...
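For a rough back-of-the-envelope comparison (weights only; this ignores the KV cache, activations, and per-block quantization overhead, so real usage is somewhat higher), the weight footprint is roughly params × bits / 8 bytes:

```python
# Weight-only memory estimate: params * bits_per_weight / 8 bytes.
# Ignores KV cache, activations, and quantization block overhead,
# so actual VRAM usage will be higher in practice.
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9  # decimal GB

for name, params, bits in [("7B @ 4-bit", 7, 4), ("13B @ 3-bit", 13, 3)]:
    print(f"{name}: ~{weight_gb(params, bits):.1f} GB of weights")
# 7B @ 4-bit: ~3.5 GB; 13B @ 3-bit: ~4.9 GB -> both fit in 8 GB on paper,
# but the 13B leaves much less headroom for context.
```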

While I'm not an expert by any means, VITS in CoquiTTS is almost real-time on CPU (I tested on a mid-range laptop CPU). With ggml and a good quant...
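In case it's useful, this is roughly how I'd measure that with the Coqui TTS Python API; the model name is one of the stock VITS checkpoints, and the `output_sample_rate` attribute is as I recall it, so treat this as a sketch rather than exact usage:

```python
import time
from TTS.api import TTS  # Coqui TTS

# Load a stock VITS checkpoint on CPU (model name from Coqui's released-models list).
tts = TTS(model_name="tts_models/en/ljspeech/vits", progress_bar=False)

text = "Real-time factor is synthesis time divided by the duration of the generated audio."
start = time.perf_counter()
wav = tts.tts(text=text)  # list/array of float samples
elapsed = time.perf_counter() - start

# Sample rate attribute assumed from the Synthesizer wrapper; may differ by version.
sample_rate = tts.synthesizer.output_sample_rate
audio_seconds = len(wav) / sample_rate
print(f"Synthesized {audio_seconds:.2f}s of audio in {elapsed:.2f}s "
      f"(RTF = {elapsed / audio_seconds:.2f}; below 1.0 means faster than real time)")
```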

I've been working on adding support for llama.cpp via llama-cpp-python; I'll see if I can make a PR soon (still working through it; attention masks are unsolved, as llama.cpp has them,...
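For context, the high-level llama-cpp-python API looks roughly like this (the model path is a placeholder, and this is a sketch of the library's documented usage, not the PR itself):

```python
from llama_cpp import Llama

# Load a quantized model from disk (path is a placeholder).
llm = Llama(model_path="./models/model.gguf", n_ctx=2048)

# The high-level call takes a prompt string plus sampling options;
# there is no explicit attention-mask argument at this level.
out = llm("Q: What is 3-bit quantization? A:", max_tokens=64, stop=["Q:"], echo=False)
print(out["choices"][0]["text"])
```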

@iofu728 Just to update: the latest bug is with the logits. I'm not very experienced with low-level PyTorch, so my guess is that this line is to focus on...
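I can't tell from the truncated line above, but the usual pattern for this kind of logits code is shifting the logits against the input ids to get per-token log-probabilities; a generic PyTorch sketch of that pattern (not the project's actual code):

```python
import torch
import torch.nn.functional as F

# Generic per-token log-probabilities from causal-LM logits.
# logits: (batch, seq_len, vocab); input_ids: (batch, seq_len).
def per_token_logprobs(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    # The logit at position t predicts the token at position t+1,
    # so drop the last logit and the first token before aligning them.
    shift_logits = logits[:, :-1, :]
    shift_labels = input_ids[:, 1:]
    log_probs = F.log_softmax(shift_logits, dim=-1)
    # Pick the log-prob of the actual next token at every position.
    return log_probs.gather(-1, shift_labels.unsqueeze(-1)).squeeze(-1)

# Example with random data, just to show the shapes involved.
logits = torch.randn(1, 5, 32000)
ids = torch.randint(0, 32000, (1, 5))
print(per_token_logprobs(logits, ids).shape)  # torch.Size([1, 4])
```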

LGTM, works well.