Patrick Devine
Unfortunately OpenAI's API doesn't have a way to do this, so we can't modify the `num_ctx` parameter directly through their API. I did write up a [doc](https://github.com/ollama/ollama/blob/main/docs/openai.md#setting-the-context-size) which explains how...
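In short, the workaround in that doc is to bake a larger context window into a derived model and then call that model through the OpenAI-compatible endpoint. A minimal sketch (the base model `llama3.1`, the name `llama3.1-8k`, and the `8192` value are just examples):

```
# Modelfile
FROM llama3.1
PARAMETER num_ctx 8192
```

Then `ollama create llama3.1-8k -f Modelfile` and point your OpenAI client at `llama3.1-8k`.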
@deep1305 were you able to get it to work? We will enable it by default soon; still trying to get people to try it out and report bugs.
@mayunqing1230 unfortunately those cards are really old now and are probably not going to be very performant. I'm going to go ahead and close the issue.
I'm fairly certain this is a packaging bug in Arch as we've seen a few of those issues recently. Can you verify if installing directly works with the instructions on...
@SteavenGamerYT see the message above. Unfortunately it looks like the Arch Linux package is broken (but we don't package it). You can install from the official binaries.
@A are you hitting this when you've run through the context, or some other case?
@A To use a larger default context you can run Ollama with:

```
OLLAMA_CONTEXT_LENGTH=8192 ollama serve
```

Here's a [link to the FAQ](https://docs.ollama.com/faq#how-can-i-specify-the-context-window-size)
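If you only need a bigger window for one interactive session, the FAQ also shows setting it from inside `ollama run` (sketch; the model name is just an example):

```
ollama run llama3.1
>>> /set parameter num_ctx 8192
```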
Make sure you're on the latest version of Ollama and you `ollama pull glm4` to get the latest version of the model. I just tested it with both Linux and macOS and it's...
`max_tokens` and `num_ctx` are definitely not the same thing. I just saw the `extra_body` API change which would allow you to set `num_ctx` which seems like a better route.
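For what that looks like on the client side: the OpenAI Python SDK merges `extra_body` into the JSON request body, so a non-standard key like `num_ctx` can ride along with an otherwise standard request. A minimal sketch of that merge (the exact key and shape Ollama accepts is an assumption; check the API change for the real details):

```python
# The SDK serializes the standard fields, then merges extra_body into the
# same JSON object before sending the request. Sketch of that merge:
body = {
    "model": "qwen2.5:7b-instruct",  # example model name
    "messages": [{"role": "user", "content": "hello"}],
}
extra_body = {"num_ctx": 8192}  # assumption: top-level key; verify against the change
request_body = {**body, **extra_body}
print(request_body["num_ctx"])  # 8192
```

In real client code that would be `client.chat.completions.create(..., extra_body={"num_ctx": 8192})`.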
@werruww that's a GGUF model, not a safetensors model. That model is already available in Ollama if you run `ollama run qwen2.5:7b-instruct-fp16`