Technotech

Results: 2 issues in Technotech

Hi! While 3-bit and 2-bit quantisations are obviously less popular than 4-bit quantisation, I'm looking into the possibility of loading 13B models with 8 GB of VRAM. So far, loading...
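A quick back-of-the-envelope sketch shows why lower bit widths matter at 8 GB. This is only a rough estimate under stated assumptions (a model with roughly 13×10⁹ weights, counting the weights alone and ignoring KV cache and activation overhead):

```python
# Rough weight-only memory footprint of a 13B-parameter model at
# different quantisation bit widths. KV cache and activations add
# further overhead on top of these numbers.
PARAMS = 13e9  # assumed parameter count for a "13B" model

def weight_size_gib(bits_per_weight: float) -> float:
    """GiB needed to store the weights alone at the given bit width."""
    return PARAMS * bits_per_weight / 8 / 1024**3

for bits in (4, 3, 2):
    print(f"{bits}-bit: ~{weight_size_gib(bits):.1f} GiB")
```

At 4-bit the weights alone come to roughly 6 GiB, leaving little headroom out of 8 GB once the KV cache is included, whereas 3-bit (~4.5 GiB) and 2-bit (~3 GiB) leave considerably more room.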

Hi, this is an interesting project. I would like to use this with llama.cpp (llama-cpp-python more specifically), but when I had a look at the code I wasn't able to...

Label: feature request