Patrick Devine
I just tried this: ``` pdevine@MacBook-Pro-4 ollama % ./ollama run wizardlm2:8x22b-q4_0 >>> hi there Hello! How can I assist you today? If you have any questions or need information on...
@joliss I'm using an M3 w/ 128 GB of RAM. Sonoma 14.2.1.
Hey @joliss, I think some changes went in recently for Metal around memory offloading, and I'm wondering if that is the issue here. These are some...
Hey @joliss, I think there is a fix for Metal offloading which should be in `0.1.33`. Can you test out the [prerelease](https://github.com/ollama/ollama/releases)? I think it should fix the problem.
@FlippingBinary If you just want the history to not be updated you can use the `/set nohistory` command. I actually have a change to clear the context as well, but...
With `0.1.21` you'll be able to type `/load ` which will clear your context. You can also use `/save ` which will save your conversation up until that point and...
This should work fine. Here's my output on a MBP: ``` % ./ollama run gemma:7b >>> hi there Hi there, and welcome to my chat! 👋 I'm glad you decided...
I realize *I* had the outdated version. :-D I think it's almost certainly a memory pressure issue w/ Metal. cc @mxyng
> @pdevine but can it be the problem here? [...] > it doesn't refuse to generate a reply or takes long to generate, but it just returns a bunch of...
@mkmohangb the team had been talking about this earlier today and we suspected that this might be the issue. Good catch. There is a fix in the 0.1.33 pre-release which...