Patrick Devine issues

Results: 32 issues of Patrick Devine

add mixtral 8x7b model conversion

This change converts Mixtral 8x7b directly into an Ollama model. The 8x22b model will be added in a separate PR as it has a different structure for the way the experts...

show ggml modelinfo through the show api

This change exposes the GGML KVs and tensor data to make it easier to introspect a model.

Convert directly from llama3

This change allows you to convert directly from a llama3 derived safetensors model into Ollama. It is currently *missing*: * pytorch *almost* works however the embeddings layer size is off...

Ollama `ps` command for showing currently loaded models

This change adds a rudimentary `ps` command which makes use of the new scheduler changes in the server. The UX for this depends on whether you're using...

update go deps

Fixes #4297

add OLLAMA_NOHISTORY to turn off history in interactive mode

Fixes #3002

Move the parser back + handle utf16 files

This moves the parser back to `parser/` and also adds support for decoding utf16le and utf16be files. Fixes #4503
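
To illustrate what decoding utf16le and utf16be input can involve, here is a minimal Go sketch that picks the byte order from the byte order mark (BOM) and converts the content to UTF-8. The helper name `decodeText` and the fallback to UTF-8 when no BOM is present are assumptions made for this example; this is not the code from the PR.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// decodeText is a hypothetical helper (not Ollama's actual parser code).
// It inspects the BOM and returns the content as a UTF-8 string.
func decodeText(data []byte) (string, error) {
	var order binary.ByteOrder
	switch {
	case bytes.HasPrefix(data, []byte{0xFF, 0xFE}):
		order = binary.LittleEndian // utf16le
	case bytes.HasPrefix(data, []byte{0xFE, 0xFF}):
		order = binary.BigEndian // utf16be
	default:
		// Assumption: input without a UTF-16 BOM is already UTF-8.
		return string(data), nil
	}

	data = data[2:] // drop the BOM
	if len(data)%2 != 0 {
		return "", fmt.Errorf("utf16 input has odd length %d", len(data))
	}

	// Reassemble 16-bit code units, then decode surrogate pairs into runes.
	units := make([]uint16, len(data)/2)
	for i := range units {
		units[i] = order.Uint16(data[2*i:])
	}
	return string(utf16.Decode(units)), nil
}

func main() {
	le := []byte{0xFF, 0xFE, 'F', 0, 'R', 0, 'O', 0, 'M', 0}
	s, err := decodeText(le)
	fmt.Println(s, err)
}
```

Relying on the BOM keeps the sketch short; files that are UTF-16 but carry no BOM would need a different detection strategy.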

convert safetensor adapters into GGUF

This change converts a Safetensors-based LoRA into GGUF and ties it to a base model. Only llama2/llama3/mistral will work initially. You can create the Modelfile to look like: ```...

add line numbers for parser errors

If a Modelfile has an error in it, it's often difficult to debug where the error is located in the Modelfile itself. This change adds the line which the error...
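
As a rough picture of how line information can be attached to a parse error, the Go sketch below counts lines while scanning and wraps the failure in an error type that carries the line number. The `ParseError` type and `checkModelfile` function are hypothetical names for this example, and the line-by-line check is a deliberate simplification (it ignores multi-line values); the PR's actual implementation may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// ParseError is a hypothetical error type that records where parsing failed.
type ParseError struct {
	Line int
	Msg  string
}

func (e *ParseError) Error() string {
	return fmt.Sprintf("line %d: %s", e.Line, e.Msg)
}

// checkModelfile is a simplified, line-oriented stand-in for a real parser.
// It only validates the leading command on each non-empty, non-comment line.
func checkModelfile(src string) error {
	known := map[string]bool{
		"FROM": true, "PARAMETER": true, "TEMPLATE": true, "SYSTEM": true,
		"ADAPTER": true, "LICENSE": true, "MESSAGE": true,
	}

	sc := bufio.NewScanner(strings.NewReader(src))
	for line := 1; sc.Scan(); line++ {
		text := strings.TrimSpace(sc.Text())
		if text == "" || strings.HasPrefix(text, "#") {
			continue // blank lines and comments still advance the counter
		}
		cmd, _, _ := strings.Cut(text, " ")
		if !known[strings.ToUpper(cmd)] {
			return &ParseError{Line: line, Msg: fmt.Sprintf("unknown command %q", cmd)}
		}
	}
	return sc.Err()
}

func main() {
	err := checkModelfile("FROM llama3\n\nBOGUS value\n")
	fmt.Println(err)
}
```

Using a dedicated error type lets callers extract the line number with `errors.As` instead of parsing the message text.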

Update the /api/create endpoint to use JSON

This PR changes the way the POST `/api/create` endpoint works by changing the way the various options/parameters get serialized and passed to the server. Currently the create endpoint requires a...