Patrick Devine
I'm sorta torn here because generally speaking Ctrl-H is backspace. Ctrl-Backspace will do a Ctrl-W in most desktop programs, but it doesn't do that in my terminal and just maps...
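For anyone wondering why Ctrl-H and backspace end up being the same thing: in ASCII, holding Ctrl with a letter clears the upper bits of that letter's code, so Ctrl-H comes out as 0x08 (BS) and Ctrl-W as 0x17. Here's a rough sketch of that mapping. The `ctrl` helper is just a name made up for illustration, not anything from ollama, and what a given terminal emulator actually sends for Backspace or Ctrl-Backspace varies (many send 0x7F for plain Backspace).

```python
def ctrl(key: str) -> int:
    """Return the ASCII control code conventionally produced by Ctrl+<key>.

    Hypothetical helper for illustration: it masks the letter's code
    down to its low five bits, which is the classic ASCII rule.
    """
    return ord(key.upper()) & 0x1F


BACKSPACE = 0x08  # ASCII BS
DELETE = 0x7F     # ASCII DEL; many terminals send this for the Backspace key

if __name__ == "__main__":
    print(f"Ctrl-H -> {ctrl('h'):#04x}")
    print(f"Ctrl-W -> {ctrl('w'):#04x}")
    print("Ctrl-H is BS:", ctrl("h") == BACKSPACE)
```

Since the byte is identical, a program reading raw input can't tell Ctrl-H apart from a key that the terminal maps to 0x08.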
Going to close this as a dupe.
@levicki Unfortunately not yet. The dupe was to track it in #4560
@YaBoyBigPat the `-q Q4_K_M` quantize flag is for quantizing a non-quantized model (i.e. fp16 or fp32) to that particular quantization level. You don't need to specify it when loading in...
@Timelessprod what framework did you use to create the model? Can you provide the `ollama create` line and the Modelfile, and is it possible to get access to the weights...
This will throw an error now which will say `unsupported safetensors model`. You can just use the unquantized model directly in ollama and specify the `--quantize` flag to quantize it to...
@YaBoyBigPat that's a lot of work to support Yet Another Quantization Format.
@Timelessprod take a look at the [import docs](https://github.com/ollama/ollama/blob/main/docs/import.md#quantizing-a-model) which explain how to quantize a model. To get an 8 bit quantized model create a modelfile which looks like: ``` FROM...
@Timelessprod Thanks for the clarification. The problem isn't reading in unsigned 8-bit ints (it should be pretty easy to convert the uint8 values into whatever), it's more that if those...
Just an update on this: @KangInKoo the problem you're running into is there is some issue w/ the gguf file that you made. I've tried separately to fine tune the...