vaiju1981

Results 5 comments of vaiju1981

There is no way to discover models that are not part of the Ollama space (for example, models in user spaces).

One can also use DJL to load GGUF files locally: https://github.com/deepjavalibrary/djl/tree/master/engines/llama. We use that for local testing. It uses llama.cpp and calls it via JNI.

The main reason is that GGUF files are small (compared to safetensors), which makes our testing and usage easier. Apart from that, there are the different quantization options. Currently we are using a...

By small size, I mean the size of the download from Hugging Face/repos: downloading a quantized model versus downloading safetensors. With GGUF, one has the vocabulary and other things such as prompt...
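
The point that a GGUF file carries its own metadata alongside the weights can be illustrated by reading the fixed-size header at the start of the file. The following is a minimal sketch, not code from DJL or llama.cpp: it assumes the GGUF version 2 or later layout (4-byte magic `GGUF`, a `uint32` version, then `uint64` tensor and metadata key/value counts, all little-endian), and the helper name `read_gguf_header` is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (metadata KV count)


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header from the first bytes of a file.

    Assumes the version 2+ layout with little-endian fields; version 1
    files used 32-bit counts and are rejected here.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need at least {HEADER_SIZE} bytes, got {len(data)}")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic")
    # "<" = little-endian without padding, I = uint32, Q = uint64
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    if version < 2:
        raise ValueError("GGUF version 1 uses a different header layout")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

To inspect a local file, read its first 24 bytes (for example `open(path, "rb").read(24)`, where `path` points at any GGUF model) and pass them to `read_gguf_header`. The metadata key/value pairs that follow the header are where items such as the tokenizer vocabulary are stored; parsing them is outside this sketch.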

If you can provide a tensor library, I can take a stab at it. Right now, in order to make llama3.2 vision work with the current code, I need to...