> @oldmanjk I'm aware this issue is about macOS; that's why I mentioned my comment was on Linux. Sometimes issues are cross-platform
> @oldmanjk I would recommend creating a new issue for Linux; as @easp said, these comments are about macOS. There are already several issues around this. I recommend you guys...
> We've changed to a subprocess model in the past few versions, which likely resolves this when the model unloads. Are people still seeing a large footprint when idle on...
I have a rig with three graphics cards on which I would like to run three separate models simultaneously and have them group chat
That's what I'm currently doing (loosely), but you also have to map each instance to a specific GPU. It works, but it's very clunky to set up. A GUI would be...
> run in docker, stick containers separately with gpu1, gpu2 or cpu only; open-webui can work with multiple ollama instances
>
> ```
> version: '3.8'
>
> services:
>   ollama:...
> ```
Can we have control over which model is run on which GPU?
You might want to wait. I think I'm still dragging more changes out of the huggingface/meta guys. So frustrating
I wish I knew. What's clear to me is that they haven't given this proper attention yet, and I'd caution everyone to slow down. Please spread the word. I deleted my...
A GUI would be even nicer